ABSTRACT

Robert L. Thorndike was awarded the Educational
Testing Service (ETS) Measurement Award at the 1972 ETS Invitational
Conference. In "Heredity, Environment, and Class or Ethnic
Differences," J. McV. Hunt addressed several fundamental questions
pertaining to the hereditary and environmental influences on the
observed social class and ethnic differences on intelligence and
scholastic achievement tests. Eleanor E. Maccoby and Carol Nagy
Jacklin, in "Sex Differences in Intellectual Functioning," summarized
the current state of knowledge regarding sex differences in verbal,
mathematical, and spatial abilities. In his paper on "Implications of
Group Differences for Test Interpretation," Lloyd G. Humphreys
examined the concept of test fairness. Charles V. Willie discussed
the previous papers in "A Theoretical Approach to Cultural and
Biological Differences." In "Recycling the Problems in Testing,"
Henry S. Dyer observed that much of the misuse of tests arises from
the diverse perceptions held by makers, givers, takers, and users of
tests. The afternoon session focused on educational implementation:
Irving E. Sigel spoke on "Where is Preschool Education Going: Or Are
We En Route Without a Road Map?," and Benjamin F. Payton discussed
"Black Colleges and Black Studies." (EW)
Foreword



The pluralistic nature of our society poses a formidable
challenge to those who attempt to measure human capabilities
and evaluate human achievements. As Chairman of the 1972
Invitational Conference, Anne Anastasi faced this challenge
squarely. She succeeded in assembling a group of speakers
who are eminently qualified to speak authoritatively on the
basic issues related to the measurement and evaluation prob- 
lems of a pluralistic society. The large, attentive audience and 
the searching questions addressed to the speakers attest to her 
talent for delineating issues and finding the right people to 
interpret them. 

Each of the speakers in his or her own way conveyed a sense
of both continuity and urgency. Our problems are not new,
and much of the relevant information has been available for
many years; yet they resist solution. All of the speakers, and
particularly Henry Dyer, who chose the recurrence of problems
as a central theme for his luncheon speech, remind us
that despite their intransigence these problems deserve the
best effort we can muster.

Last year I expressed my hope that this decade would see a
new and much-needed synthesis of education and measurement.
I believe that the 1972 Conference represents a significant
step in that direction.

William W. Turnbull
Preface



The general theme of the 1972 Invitational Conference on Testing
Problems was Assessment in a Pluralistic Society. This is a broad
theme that could be considered from the viewpoint of many disciplines
and could concentrate on many different concerns and problems.
The approach followed in this Conference was to inquire into the
contributions that basic research in psychology and related sciences
can make toward the improvement of assessment. The underlying
question was: How can our present knowledge about the nature and
origins of individual and group differences in behavior improve the
development and use of assessment procedures and the interpretation
of assessment results? The term "assessment" was chosen advisedly,
to cover not only testing but also other procedures for observing and
evaluating human behavior. Although much of what the speakers
reported was expressed with reference to tests, their conclusions and the
implications of their findings generally apply equally well to other
observational techniques, such as interviews, ratings, and records of
job or school performance.

Psychometrics has made rapid progress in the refinement of techniques
of test construction and the statistical evaluation of test results.
At the same time, the increasing specialization of both scholarly
and professional activities has tended to dissociate psychological testing
from the mainstream of contemporary psychology. As psychometricians
concentrate more and more on the testing process, they
may lose sight of the behavior they set out to measure. As a result,
outmoded interpretations of test scores may remain insulated from
the findings of subsequent behavioral research. The widespread
misconceptions about the so-called IQ provide a particularly flagrant
example of such a dissociation. One still hears the term "IQ" used as
though it referred, not to a test score, but to a property of the organism.
It was a major objective of the 1972 Invitational Conference to
encourage a rapprochement between testing and behavioral research
and to illustrate some points of contact between the two fields.

In the morning session the focus was on understanding: What do
we know about human behavior and what does such knowledge imply
for assessment? The three papers provided examples of the type of
answers that contemporary psychology can give to these questions. In
the opening paper, J. McV. Hunt addressed himself to several fundamental
questions pertaining to the operation of hereditary and environmental
factors in the development of observed social-class and
ethnic differences on intelligence and scholastic achievement tests.
The discussion was illustrated with research on the influence of early
experience on the performance of animals and human infants. Such
concepts as heritability and IQ constancy were analyzed.

The second speaker, Eleanor Maccoby, summarized the current
state of knowledge regarding sex differences in verbal, mathematical,
and spatial abilities, with particular attention to developmental
changes from infancy to adolescence. The age-old question of sex
differences in variability was reexamined in the light of recent data.
Finally, in a discussion of the biological and experiential origins of
intellectual sex differences, Dr. Maccoby critically reviewed several
etiological hypotheses and pointed to the need for more solid data to
test such hypotheses.

In his paper on "Implications of Group Differences for Test
Interpretation," Lloyd Humphreys examined the concept of "test bias" or
"test fairness," with specific reference to the accuracy of inferences
drawn from test scores for males and females, ethnic minorities, and
socioeconomic and regional subgroups. Various proposed solutions
for reducing test bias in selection were compared and critically evaluated.
Dr. Humphreys emphasized the practical value of tests in identifying
existing learning deficiencies, whatever their cause, and observed
that unfairness results not from the use of tests but from the
incorrect interpretation of test scores as measures of learning potential
and of fixed, innate capacities.

In order to broaden the perspective and provide a fresh look at the
topic from a different angle, a representative of a related field was
chosen as discussant for the three preceding papers. Charles Willie,
Professor of Sociology and Vice President for Student Affairs at
Syracuse University, was invited to perform this challenging task. Dr.
Willie cited evidence reported by each of the speakers in support of
the plasticity of human psychological development and its susceptibility
to environmental influences. He also referred to the biological
value of genetic polymorphism within a human population, a value
that parallels the cultural advantages of a pluralistic society. His own
research on the performance of black students in predominantly
white colleges demonstrated the overlapping ranges of intellectual
capacity found among blacks and whites.

In the luncheon address, Henry Dyer pointed out that psychometricians
and educators have for many years expressed concern about the
misuse of tests. Today, however, the problem has assumed new magnitude,
involving potential misinterpretation of test results not only by
school personnel but also by politicians and their pluralistic constituencies.
Observing that the misuse of tests arises largely from the
diverse perceptions of testing held by the makers, givers, takers, and
users of tests, he illustrated his point with regard to two of the six
possible pairs among these four classes: makers-users and givers-takers.
To the previously cited dissociation between psychometrics
and behavioral research, Dr. Dyer thus added other dimensions of
compartmentalization, the practical consequences of which may be
even more serious and far-reaching.

In the afternoon session, the focus was on educational implementation.
Assessment is not usually an end in itself, but rather a means
to some other, practical end. The development of appropriate educational
programs is one major objective of assessment. Accordingly,
the two afternoon speakers considered educational programs designed
to meet the diverse needs of a pluralistic culture. The specific programs
discussed were drawn from widely separated levels of the educational
ladder: from the preschool in one case and from college in the other.

Irving Sigel called attention to the resurgence of interest in early
childhood education in the 1960s, an interest too often manifested,
however, in poorly planned crash programs that led to negative results
and mounting disillusionment. In contrast, a greatly expanded
knowledge base from psychology and related disciplines is now available.
Dr. Sigel argued convincingly for three innovations: sophisticated
reconceptualization of developmental processes, the use of
specific achievement tests to measure progress in the skills developed
by educational programs (instead of such global scores as "IQs"),
and a reformulation of the goals of early childhood education in more
realistic terms. In the appendix to his talk, these points are illustrated
by reference to his own ongoing research in the Early Childhood
Education Project at the State University of New York at Buffalo.

In the closing paper of the Conference, Benjamin Payton presented
black colleges and black studies programs as means of preserving
cultural pluralism. He noted that, historically, black colleges have
been primarily responsible for the educational advancement of the
black population in America; and he argued that this type of institution
still has an important role in current efforts to achieve equality of
status for minority groups.

For the success of this Invitational Conference, primary credit is
due to the speakers and invited discussant for providing illuminating
and potent distillates of their many years of research and thought on
the topics they covered. Thanks are also extended to the distinguished
audience for its active interest and support, and especially to those
who contributed searching and provocative questions from the floor.
Some of these questions, together with the speakers' replies, are
reproduced. From the Conference Chairman's special observation post, it
is abundantly clear that the professional quality and smooth operation
of the Conference are attributable to the imaginative, competent, and
tireless activity of many members of the ETS staff. To name individuals
would produce a very long list indeed, and even then some would be
missed who had toiled mightily behind the scenes to bring off once
more what has become a major annual event in psychometric circles.
At the risk of sounding like the traditional acknowledgments in the
front pages of doctoral dissertations, however, I feel impelled to
record a special indebtedness to Anna Dragositz, without whose
expertise, judgment, and good humor in the face of the inevitable
frustrations and delays this Conference could not have taken place.
My numerous contacts with her were invariably productive and a
source of genuine personal pleasure for me.

Anne Anastasi 

CHAIRMAN 






Robert L. Thorndike (left) receiving the 1972 ETS Measurement Award from ETS
President Turnbull at the Conference luncheon.




One president to another: Mr. Turnbull congratulates EDUCOM President Henry
Chauncey, Chairman-elect of the 1973 Invitational Conference on Testing Problems.
EDUCATIONAL TESTING SERVICE

Measurement Award



The ETS Award for Distinguished Service to Measurement was established
in 1970, to be presented annually to an individual whose work
and career have had a major impact on developments in educational
and psychological measurement. The 1972 Award was presented at
the Conference by ETS President William W. Turnbull to Professor
Robert Ladd Thorndike with the following citation:

A need to get to the heart of the matter and the intelligence to make it
possible have been the foundation of Robert L. Thorndike's distinguished
career in education and measurement.

He has provided clear and logical formulations in a number of important
and complex areas, focusing on such concepts as the reliability of
tests, the meaning and measurement of underachievement, and the
assessment of test bias. A capacity for powerful analysis and precise
writing has contributed to the effectiveness of his leadership in the
collaborative revision of a major summative work, the American Council
on Education's Educational Measurement.

As a scholar, he has combined the theory of measurement with its 
practice in developing a variety of instruments for assessing psychological
traits and educational attainments. Further, through his gifts as a 
teacher and author, he has given thousands of students the skills to create 
and interpret instruments of their own. 

Robert Thorndike's research studies are distinguished by their elegance
and by the importance of the questions they address. Our understanding 
of the determinants of career choice and the correlates of career success, 
for example, has been furthered by his research. And we are indebted 
to him for his studies of such diverse areas as human problem solving 
and the comparative outcomes of education in different nations. 

For his many contributions to the theory and practice of educational
measurement, ETS is pleased to present the 1972 Award for Distinguished
Service to Measurement to Robert L. Thorndike.
Session I

Origins and Implications
of Differences

Heredity,
Environment,
and Class or
Ethnic Differences¹



J. McV. Hunt 
University of Illinois 
Champaign 






That people differ on the average according to the social classes and
the ethnic or racial groups to which they belong is an ancient observation.
Within our own United States, the mean values or the incidence
of behavioral phenomena have been observed to vary significantly for
nearly every characteristic where systematic measurement has been
tried. Such characteristics include emotional adjustment as measured
by both the Minnesota Multiphasic Personality Inventory (Gough,
1946, 1948) and the Vineland Social Maturity Scale (Sims, 1954); the
incidence of behavior disorders (Clark, 1949; Faris & Dunham, 1939;
Landis & Page, 1938); the incidence of delinquent and criminal behavior
(Shaw, 1929; Short & Strodtbeck, 1965); persistence in goal
striving (Battle & Rotter, 1963; Hertzig, et al., 1968; Zigler & Butterfield,
1968); the tendency to be impulsive rather than reflective
(Meichenbaum & Goodman, 1969; Mumbauer & Miller, 1970); measures
of ability; and measures of both scholastic and occupational
achievement. Such differences serve to define the social-class structure
as described by Warner and his colleagues (see Warner, et al., 1949).
Differences among ethnic and racial groups include a similar variety
of characteristics plus differences of language, skin color, and cultural
attitude.






¹The preparation of this paper was supported by grants from the United States
Public Health Service: MH-K6-18567 and MH-11321.
Heredity and Environment



Especially prominent have been the social-class, ethnic, and racial
differences in performance on tests of intelligence and scholastic
achievement. Since the evidence is most abundant for such measures,
they will be the focus of this paper. Where social class is the focus, the
IQs of children entering school from professional families average
about 115 while those from unskilled laborers average about 95
(McNemar, 1942), and differences in mean IQ of this order of 20
points are typical of those reported by other investigators (see Anastasi,
1958, Ch. 15 on Social Class Differences). Where race is the
focus, black children usually average an IQ about one standard deviation,
or 15 points, below white children (Shuey, 1966). Moreover,
Mayeske (1971) has described a special analysis of the data in the
report on equality of educational opportunity by Coleman and
others (1966) which shows that 25 percent of the total variance among
American students in their academic achievement was associated with
membership in one of six ethnic-racial groups: Indian, Mexican,
native white, Negro, Oriental, and Puerto Rican.

Although some individuals have risen out of poverty to top levels of
excellence, there can be no blinking these class and race differences.
They exist. But are they biologically inevitable to the degree in which
they now manifest themselves? Are class differences an inevitable
matter of competitive social selection which has resulted from genotypic
limits on potential? Or, might one expect most of such differences
as exist from the conditions of child-rearing in the various
social classes? Are the observed ethnic and racial differences merely a
matter of the relative frequency in which certain genes are distributed
in these groups? Or, inasmuch as the predominant majority of Indians,
Mexicans, Negroes, and Puerto Ricans have lived since birth in conditions
of poverty in families of little education and little hope, might
one justifiably expect what they show on the tests and in the schools
from the conditions of their rearing?

These questions are usually taken to involve the old issue of nature
and nurture and to pose the question of their relative potency. Let me
begin by recognizing that heredity is clearly primary. For such a matter
as whether a given fertilized ovum becomes a human being or a member
of another species, heredity is all important. One gets only elephants
from breeding elephants, nothing else. Moreover, each individual
begins life with a given complement of genes which he received
in equal shares from each parent. The DNA in his genes carries
information which exerts a continuing influence on his development
throughout his life. Since a life begins with conception, heredity is
always primary. Yet, having such a primary status and exerting a continuing
influence throughout life need imply neither genetically fixed
traits nor a predetermined course and rate for later development.



Ways of Obtaining Answers 

One approach to answering the questions raised here comes by way of
definition. Cyril Burt (1969) has recently contended that intelligence
should be defined as "innate, general, cognitive ability." Such a contention
has been the dominant view in England ever since Sir Francis
Galton (1869), Charles Darwin's younger cousin, conceived genetically
fixed traits as implicit in the theory of natural selection, launched
the study of individual differences with the publication of Hereditary
Genius, and invented the correlation statistic to show the persistence
of individual differences across generations. Karl Pearson, Galton's
successor, improved on the correlation method, extended the scope of
such investigation, and used the evidence to support his own such
definition (Pearson, 1902, 1904).

Defining intelligence as innate ability has the defect of leaving it
unmeasurable. Or, if scores on tests of intelligence are taken as measures
of innate ability, this approach confuses an observable and measurable
phenotype with the genotype which can only be inferred, a
distinction which Johannsen (1903, 1909), one of the fathers of scientific
genetics, made at the turn of the century. Ignorance or neglect of
this distinction has all too often left psychologists discussing measures
of intelligence as if they were genotypic. Moreover, the traditional
distinction between tests of intelligence or aptitude and achievement
tests tends to maintain such confusion. Here it has been the merit of
Lloyd Humphreys (1962) to recognize that these two kinds of tests
differ only in degree, with intelligence tests including a wider variety
of items, tapping a wider variety of experience, and being further removed
in time from relevant learning situations than achievement
tests.

An approach by definition can start quite as justifiably with the
contention that intelligence is primarily a product of learning rather
than its cause. Thus, George Ferguson (1954, 1956) has explained
abilities as derived from factor analysis as the results of transfer of
training in overlearned skills. Moreover, Gagne (1968) has viewed
mental ability as a product of cumulative learning in which various
skills form a transfer hierarchy ranging from stimulus-response connections
through chains, motor and verbal, multiple discriminations,
concepts, and rules, simple and complex. In the recognition of a
hierarchical structure, Gagne's view resembles that of Piaget (1937,
1947), and my own (Hunt, 1961), yet Gagne's view gives less importance
than either Piaget's or mine to the effect of the individual's existing
organization of abilities on the consequences of his encounters
with specified circumstances. Clearly, a definitional approach, by itself,
can lead only to fruitless debate. The scores of all kinds of tests of
abilities represent past developmental achievements in which the influence
of the genes has interacted continuously with the influence of
the environment in an on-going series of encounters.

The second approach is empirical. It is based upon the correlations
between test scores from relatives, and the operations are quite independent
of these belief-based definitions. Such correlations have been
found regularly to be a positive function of the degree of genetic relatedness.
Thus the correlations between monozygotic or identical
twins are higher than those between dizygotic twins or siblings, or
those between parents and children, which, in turn, are higher than
those between cousins, and those between first cousins are somewhat
higher than those between second cousins or unrelated children
reared apart (see Erlenmeyer-Kimling & Jarvik, 1963; Fuller &
Thompson, 1960, Ch. 7; Outhit, 1933). Moreover, the correlation between
the scores for siblings reared apart (r = +.34 in Freeman,
Holzinger, & Mitchell, 1928; median r = +.47 from 33 studies in
Jensen, 1969, p. 49) is higher than that reported for first cousins (r =
+.06 in Gray & Moshinsky, 1933; median r = +.26 from 3 studies
in Jensen, 1969). Such findings and others (see Jensen, 1969) clearly
attest a genetic influence on phenotypic measures of intelligence.

It has been more difficult to say how great this influence is. The
traditional effort to separate and specify the relative strengths of the
influences of heredity and environment has led to a variety of research
and statistical designs beyond the scope of this essay (see Cattell, 1953,
1960; Falconer, 1960; Fuller & Thompson, 1960). The answer to the
question has come in the form of estimates of heritability. Heritability
is defined as that proportion of the variance within a specific population
in the phenotypic measure of a characteristic that is determined
by the genotypic variation within that population (Rieger, Michaelis,
& Green, 1968, p. 213). In selection experiments, a simple approximation
of heritability may be obtained by dividing the gain in the offspring
by the selection differential. This gain is the difference of the
mean of the trait measures concerned for the offspring from the
mean for the population. The selection differential is the difference of
the mean for those selected to be the parents from the mean of the
population. The closer this ratio is to unity, the higher the heritability.
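
The ratio just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the three means below are hypothetical, chosen for convenience, and are not taken from any study cited here.

```python
def realized_heritability(population_mean, parent_mean, offspring_mean):
    """Approximate heritability from a selection experiment.

    gain                   = offspring mean - population mean
    selection differential = mean of selected parents - population mean
    heritability           ~ gain / selection differential
    """
    selection_differential = parent_mean - population_mean
    if selection_differential == 0:
        raise ValueError("parents were not selected: differential is zero")
    gain = offspring_mean - population_mean
    return gain / selection_differential


# Hypothetical values: the selected parents average 20 units above the
# population mean, and their offspring average 16 units above it.
estimate = realized_heritability(
    population_mean=100.0, parent_mean=120.0, offspring_mean=116.0
)
print(round(estimate, 2))  # 0.8
```

If the offspring mean fell all the way back to the population mean, the ratio would be zero; if the offspring matched their selected parents, it would be unity.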

Since experimental selective manipulation of human matings has
been out of the question for many reasons, estimates of heritability for
human intelligence have been based upon statistical manipulations of
the correlations between relatives. The findings have varied (see reviews
by Erlenmeyer-Kimling & Jarvik, 1963; and by Fuller &
Thompson, 1960). Galton (1883) began this effort with his studies of
twins and concluded that "nature prevails enormously over nurture
when the differences of nurture do not exceed what is commonly to
be found among persons of the same rank of society and in the same
country [p. 241]" although he lacked a numerical estimate. Such an
estimate based upon the correlation between identical twins reared
apart is conceptually the most obvious, if their environments are uncorrelated,
because they have only their genes in common. Newman,
Freeman, and Holzinger (1937) reported a correlation of +.77 between
the Stanford-Binet IQs of their 19 pairs. More recently, according
to Jensen (1969), Shields (1962) has also reported a correlation of
+.77 between composite scores from 44 pairs on Raven's Progressive
Matrices, and most recently, Cyril Burt (1966) has reported a correlation
of +.86 between the Stanford-Binet IQs of 53 pairs of identical
English twins. Such findings support both Galton's statement and the
oft-quoted numerical estimate that 80 percent of phenotypic variance
in the IQ derives from heredity.

Other bases for estimating heritability yield estimates differing little
from the correlation between measures of intelligence for identical
twins. The median of the correlations between such measures in five
samples of unrelated children reared together is +.24. Since all these
have in common is their environments, this serves as an estimate of
variance in phenotypic intelligence due to environment. Subtracted
from 100 percent, it leaves 76 percent due to heredity. Another basis
comes from an attempt to assess the extent of family resemblance with
cultural differences held constant by means of what was contended to
be a culture-free test. Again, approximately 80 percent of the variance
in phenotypic measures of intelligence among families was attributed
to heredity (Cattell & Willson, 1938). When Jensen (1967) applied his
generalized formula (for estimating heritability from any two correlation
sets of relatives where the degree of kinship is higher for one
than the other) to all those reported, he got a composite heritability
value of +.77 (+.81 when corrected for unreliability). This he regards
as the best overall estimate of the heritability of human intelligence
(Jensen, 1969, p. 43). (Just how any single estimate can be
expected for such a population statistic which varies with the variability
of both genetic and environmental factors, he does not explain.)
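
The arithmetic behind these estimates can be sketched as follows. The paragraph above does not state Jensen's formula or the reliability figure he used, so both are assumptions here: the two-kinship function uses one commonly quoted form (the difference between two observed correlations divided by the difference between the corresponding theoretical genetic correlations), and the reliability of .95 is chosen only because it reproduces the quoted correction from +.77 to +.81. The kinship correlations in the last example are hypothetical.

```python
def heredity_share_from_unrelated_reared_together(r_environment):
    """Treat the correlation for unrelated children reared together as the
    environmental share and subtract it from unity."""
    return 1.0 - r_environment


def correct_for_unreliability(estimate, test_reliability):
    """Standard correction for attenuation: divide by the test reliability."""
    return estimate / test_reliability


def heritability_from_two_kinships(r_closer, r_farther, g_closer, g_farther):
    """ASSUMED form of a two-kinship estimator: observed correlation
    difference over theoretical genetic correlation difference."""
    return (r_closer - r_farther) / (g_closer - g_farther)


# Figure from the text: median r = +.24 leaves 76 percent for heredity.
print(round(heredity_share_from_unrelated_reared_together(0.24), 2))  # 0.76

# Composite .77 from the text; the reliability of .95 is an assumption.
print(round(correct_for_unreliability(0.77, 0.95), 2))  # 0.81

# Hypothetical correlations for two kinships whose genetic correlations
# are taken as 1.0 and 0.5.
print(round(heritability_from_two_kinships(0.87, 0.56, 1.0, 0.5), 2))  # 0.62
```

Each function returns a different number from different inputs, which is the point of the parenthetical remark above: these are population statistics that move with the variability of the genetic and environmental factors sampled.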

Before World War II, textbooks of psychology often stated that
heredity accounts for approximately 80 percent of the variance in the
IQ. This is the assertion and the line of reasoning that I characterized,
largely for purposes of exposition, as "the belief in fixed intelligence"
(Hunt, 1961). Along with this belief went another, that behavior unfolds
directly with genetically controlled maturation, which I termed
"the belief in predetermined development" and which I traced back
to G. Stanley Hall and his elaboration of what he saw as the implications
of the recapitulation doctrine. This latter belief set the conceptual
stage for the claim of a constant IQ and the notion that individual
differences start with conception and remain essentially constant
throughout development and life. It is basically this same argument
which Arthur Jensen (1969) has revisited to explain what has come
to be characterized as "the failure of Project Head Start." Implied
also is the proposition that if 80 percent of the variance in phenotypic
measures of intelligence derives from heredity, then most of those
differences observed between the means of the IQs for children of professional
and unskilled parents or the children of parents in the racial
groupings must be biologically inevitable.



The Dissonant Evidence of Developmental Plasticity 

This claim that 80 percent of the variance in phenotypic measures of
intelligence and developmental advancement is attributable to heredity
makes other evidence of developmental plasticity highly puzzling.
Bits of such evidence began turning up even before World War II.
They included the increases in the IQs of preschool children associated
with nursery schooling in the Iowa studies (Skeels, Updegraff, et al.,
1938; Wellman, 1940), lack of longitudinal predictive value for scores
from tests given infants during their first two years even though these
scores exhibited satisfactory reliability (Bayley, 1940), and the finding
by Skeels and Dye (1939) of dramatic increases in the IQs of orphanage-reared
infants who were moved to an institution for the mentally
retarded where "the older and brighter girls on the ward became very
much attached to the children and would play with them during most
of their waking hours" (p. 5; see Hunt, 1961). Instead of calling the
beliefs in "fixed intelligence" and "predetermined development" into
question, such was the firmness of faith in them that such evidences
merely evoked a flood of methodological criticisms tending to discredit
them.

More of such evidence has come since World War II. Spitz (1945,
1946a, 1946b) and Dennis (1960) reported apathy and dramatic retardation,
even in locomotion, associated with orphanage rearing.
Students of neonatal behavior turned up unsuspected evidences of
ability for both classical and operant conditioning during the first few
days following birth (for reviews see Lipsitt, 1963, 1966, 1967).
Investigations stemming from the neuropsychological theorizing of
Donald Hebb (1949) brought evidence that rearing chimpanzees in
the dark hampers the development of their visually controlled behaviors
(Riesen, 1947, 1958) and that increasing the complexity of
early perceptual experiences of rats increases substantially their later
ability to learn the Hebb-Williams (1946) mazes (Forgays & Forgays,
1952; Forgus, 1954; Hymovitch, 1952). According to this same principle,
both rats (Hebb, 1947) and dogs (Thompson & Heron, 1954)
which were pet-reared became significantly superior as adults to their
cage-reared littermates in learning these mazes. For many skeptics,
such evidences from the animal laboratory obviated selectivity and
regression toward the mean, two of the criticisms commonly leveled
against the evidence from the orphanage and nursery-school studies,
as the basis for such early effects of experience on development. Thus,
they lent also a cubit of credibility to the orphanage evidence.

Other evidence has suggested that the use of a sensorimotor
organization solidifies its structure and hastens its development. In one of
these studies, providing infants, beginning at five weeks of age, with a
stabile pattern over their cribs to look at reduced the age at which
the blink response appeared, to target drops of 11.5 inches, from 10.4
weeks of age for 10 control infants without such an opportunity to use
their eyes, to an average of seven weeks for 10 infants provided with
stabile patterns (Greenberg, Uzgiris, & Hunt, 1968). In another such
investigation, providing infants with visual targets of complexity
properly matched to their level of development reduced the age at
which they achieved mature reaching for a seen target from a median
of 145 days, for infants in an original normative study, to a median of
89 days for the infants in a second enrichment study where the
complexity of the visual stabile was properly matched to the development
of the children (White, 1967). If one casts these findings into the terms
of Stern's (1912) IQ ratio in order to put them in familiar perspective,
lowering the age for the appearance of the blink response from 10.4
weeks to 7 weeks is a gain of approximately 48 points, and lowering
the achievement of that visuomotor coordination in mature reaching
from a median of 145 days to a median of 89 days is a gain in the order
of 63 points. Such a transformation applies only to past development
and should imply no permanence unless the circumstances of these
infants are so arranged as to give them special opportunities to
accommodate their advanced visuomotor skills to new situations calling
for further development.
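
The conversion into Stern's ratio can be checked with a few lines of
arithmetic. The sketch below is only an illustration: it assumes that the
age at which the comparison group reached the landmark plays the part of
"mental age" and that the age at which the enriched group reached it plays
the part of chronological age. The function name is hypothetical and does
not come from the studies cited.

```python
def ratio_iq(norm_age: float, achieved_age: float) -> float:
    """Stern-style ratio: 100 * (normative age / age at which the landmark was achieved).

    Assumption for illustration: the comparison group's age stands in for
    mental age, the enriched group's age for chronological age.
    """
    return 100.0 * norm_age / achieved_age


# Blink response: 10.4 weeks (controls) versus 7 weeks (stabile pattern).
blink_gain = ratio_iq(10.4, 7.0) - 100.0

# Mature reaching: median of 145 days (normative) versus 89 days (enriched).
reach_gain = ratio_iq(145.0, 89.0) - 100.0

print(f"blink response gain:  {blink_gain:.1f} points")
print(f"mature reaching gain: {reach_gain:.1f} points")
```

Under these assumptions the two gains come to roughly 48.6 and 62.9
points, in line with the approximate figures of 48 and 63 given in the
text.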

Evidence of plasticity comes also for abilities more closely akin to
tested intelligence and scholastic proficiency in what Piaget (1936,
1937) has termed object construction and imitation. We have
developed a set of six sequentially ordinal scales (Uzgiris & Hunt, 1973).
One is for object permanence and involves what is probably the most
basic epistemological construction, and the other to be discussed here
is for the development of vocal imitation.

Such ordinal scales enable one to define a level within a line of
development existing between the top landmark passed and the first
one failed. In cross-sectional studies, one can then compare the ages of
children at given levels of development who are living under differing
environmental conditions. In Athens are two orphanages for
illegitimate babies: a municipal orphanage where the infant-caretaker ratio
approximates 10/1, and Metera, which attempts to be a model baby
center, where this ratio is of the order of 3/1. Whether an infant comes
to one or the other of these orphanages appears to be a matter of
chance. We have compared the ages of all the children from these two
orphanages who were at the several levels on these scales of object
permanence and vocal imitation; we also included a sample of
children from working-class families. The results for object permanence
appear in Figure 1. For illustrative purposes consider the means and
standard deviations for that level of object permanence where a child
follows an object through one hidden displacement but not through a
succession of them. These appear in the left three of the middle cluster
of vertical bars. For the Municipal Orphanage, the tallest bar on the
left, the mean age for the five children at this level is 143.76 weeks with
a standard deviation of 29.19 weeks; for Metera the mean, also for
five children, is 94.39 weeks with a standard deviation of 11.13 weeks;
and for the home-reared children, third bar, the mean age for the 16
children at this level is 87.9 weeks with a standard deviation of 28.06.
The children of the Municipal Orphanage average significantly older
than those from either Metera or the working-class homes, who do not
differ significantly.

Consider also that level of object permanence at which a child
follows an object through a series of hidden displacements and reverses
the order in which the container of the desired object disappeared but
fails to copy a series of five sticks decreasing in length from bottom to
top. This is the top level on our scale of object permanence. For the
Municipal Orphanage, the mean age for seven children at this level is
206.58 weeks with a standard deviation of 34.29; for Metera, the
mean age of four children is 154.58 weeks with a standard deviation
of 17.32 weeks; and for the 16 home-reared children at this level the
mean age is 130.77 with a standard deviation of 47.11 weeks. Again,
the children of the Municipal Orphanage are significantly older than
either of the other groups, who do not differ significantly. It is also
worth noting the standard deviations. The smallest is that for Metera
where the conditions of rearing are most nearly alike; next for the
Municipal Orphanage where the child-caretaker ratio of 10/1 almost
insures the choice of pets while others are neglected; but the standard
deviation is largest by about 13 weeks for the home-reared children
(Paraskevopoulos & Hunt, 1971). Presumably variations within
families combine with differences in heredity to exaggerate the variance in
the development of object permanence.

More directly relevant to the issue of the degree to which the class
differences commonly observed are biologically inevitable is the
evidence from two longitudinal studies made for quite other purposes.
In one, a still unpublished study by Ina Uzgiris, the scales were
administered every other week to 12 infants from middle-class homes.
The other, also a still unpublished study by Schickedanz and myself,
was made to evaluate the effects of a mothers' training program
(Badger, 1971, 1972) on the development of the children of poverty being
reared by parents participating in the Parent and Child Center of Mt.
Carmel, Illinois. Here again, the scales were administered
consecutively every other week to eight infants of these parents from the
poverty sector who were participating in the program of this Center.
Let us consider the means and standard deviations of the ages at
which these two samples of children achieved the same two levels of
object permanence. In Figure 1, the results for these two samples
appear in the two bars on the right of each cluster. For following a
desired object through one hidden displacement, note again the
middle cluster. The mean age for the children of middle-class homes was
58.46 weeks (S.D. = 7.43 weeks) and that of the infants of parents of
poverty who were participating in the mothers' training program was
48.06 weeks (S.D. = 9.22 weeks). All eight had achieved this level by one
year of age, and the youngest by 44 weeks of age. For the top level of
object permanence, following a desired object through a series of
three hidden displacements in a reverse order, the mean age for those
of middle-class was 91.36 weeks (S.D. = 9.43) and for the children of
poverty in the Parent and Child Center was 72.74 weeks (S.D. =
13.99). Apparently this mothers' training program has hastened the
development of object permanence. It is often assumed that the
child-rearing of middle-class families approximates the optimum, but here is
an instance in which the child-rearing of the parent caretakers from the
poverty sector has been improved by a program of mother training to
surpass that of the middle-class, at least for object permanence
during the first two years of infancy. If support for the Parent and Child
Center program is continued long enough, we shall be able to follow
the development of these children and also their performance into
school to see how their performance there compares with that of their
older siblings who lived through their early years before their mothers
were touched by Mrs. Badger's program of mother-training. We
desperately need the kind of evidence one can get only from such
prolonged longitudinal studies of children developing under differing
environmental circumstances.

Only class differences have been concerned in these comparisons for
object permanence. All of the children in both samples were white.
But evidence of plasticity in phenotypic measures of intelligence comes
also for black children from a demonstration underway in Milwaukee
which is directed by Heber and Garber (1972) of the University of
Wisconsin. According to a progress report (Heber, 1970) to the
agency supporting their work, Heber and Garber started their
investigation with a survey of tested intelligence in that census
tract of Milwaukee with the lowest socioeconomic status. From this
survey came the interesting finding that approximately 80 percent of
those children with IQs below 80 came from families where the mothers
had IQs below 80. Since nearly all of the people in this survey were
black, this finding fit well the hypothesis of biologically inevitable
differences, but these investigators did not stop at this point. Rather, they
selected a sample of 40 high-risk mothers with new infants. These
mothers were identified objectively by IQs of 75 or below. For 20 of
these families, a home visitor saw and played with each infant until the
infant was approximately six months old; then Heber and Garber
arranged to have the infant brought five days a week to a day-care
center. There, each was cared for by a woman who had been selected
for articulate speech and who had been trained to provide appropriate
educational experiences for infants. Each infant was also put on a
program of repeated testing, and the mother was given counseling in
homemaking and in the buying and preparation of food. For the
other 20 families, the program was limited to routine counseling visits
with the mothers, and what I believe was a duplication of the program
of repeated testing for their infants. At age 45 months, the IQs of the
children of the control families average approximately 90. This is a
whole standard deviation above that of their mothers, an unusual
degree of increase which probably derives in part from the repeated
testing as well as from the expected regression effect. At this same age
of 45 months, the IQs of those children provided with educational
day-care are reported to average an almost unbelievable 128. Unless there
is something very wrong with this demonstration that I cannot now
see, it provides spectacular evidence of plasticity in a standard
phenotypic measure of intelligence within black children from families of
the highest risk where mothers have IQs near the low end of the scale.
Heber has been properly cautious about attributing permanence to
such a gain. Moreover, I suspect considerable loss is inevitable if these
children are simply returned to their families and to their
neighborhood schools. Tests of intelligence measure only past achievements.
They say very little of the future unless the circumstances of future
development are specified.

MATURATION AND EXPERIENCE

In our traditional conception of development, maturation and
learning have represented completely separate processes. Maturation has been
considered to be controlled by the genes. Learning has been conceived
to be controlled by environmental encounters. In this third quarter of
the 20th century, however, clear evidence has come that informational
interaction through the eyes influences maturation within the central
nervous system. For the most part, these investigations have been
inspired by the neuropsychological theorizing of Donald Hebb (1949)
and the neurobiochemical theorizing of Holger Hyden (1959).
Apparently inspired by Hebb's theorizing, Austin Riesen reared
chimpanzees in the dark. The dark-rearing resulted not only in behavioral
deficiencies but also in a diminution of the number of nerve cells and
glial fibers developing within their retinal ganglia by adulthood
(Chow, Riesen, & Newell, 1957). Corroboratively, Brattgard (1952),
inspired by Hyden's biochemical theory of memory, reported that
rearing rabbits in the dark results in a paucity of RNA production in
their retinal ganglia as adults. Since then, a California group
(Bennett, Diamond, Krech, & Rosenzweig, 1964; Krech, Rosenzweig, &
Bennett, 1966) has reported that thickness of the cerebral cortex and
the level of total acetylcholinesterase activity of the cortex, as well as
rate of adult maze-learning, are a function of the complexity of the
environment during early life. Quite recently, studies of the effects of
dark-rearing during early life have been extended through the visual
system. Wiesel and Hubel (195"^), for instance, have demonstrated
that dark-rearing produced a paucity of both cells and glial fibers in
the lateral geniculate body of the thalamus, and a Spanish
investigator, Valverde, and his collaborators have obtained evidence that
dark-rearing also decreases both dendritic branching and the number of
spines which develop on dendritic processes of the large apical cells of
the striate area in the occipital lobes in mice (Valverde, 1967, 1968;
Valverde & Esteban, 1968). Evidence that dark-rearing diminishes
higher-order dendritic branching in cats as well as mice has been
reported by Coleman and Riesen (1968). Evidence suggesting that such
effects on dendritic branching and spine density may be a matter of
the complexity of the information encountered and the variety of
adaptations called forth has come from studies by Holloway (1966)
and Schapiro and Vukovich (1970). Most recently, Fred Volkmar and
William Greenough (1972) have compared the dendritic branching of
stellate and pyramidal neurones, Golgi stained, in several layers of the
occipital cortex for litter-mate rats reared in a complex environment
(where a group of 12 pups were housed in a large wire-mesh cage
provided with a variety of toys that were changed daily), in
environments consisting of pairs of animals in standard laboratory cages, or
single animals in such cages. Those reared in the complex environment
exhibited considerably greater branching of dendrites of the third
order on. They showed seven times as many fifth-order branches on
the pyramidal cells of Layer V than did their litter-mates reared in
isolation and about 3.5 times more than their litter-mates reared in
social pairs. Since light was available for all, it would appear to be
complexity of experience rather than mere absence or presence of
light which is responsible for these very substantial differences. Not
only do such findings show a great deal of environmental plasticity in
the neuroanatomical maturation; they also suggest that the variety of
informational inputs from circumstances of greater complexity call
forth accommodative adaptations which show in neuroanatomy as well
as in behavior. If one considers 80 percent of the variance in
phenotypic measures of intelligence and related matters to be relevant to
such evidences of plasticity in behavior and neuroanatomical
maturation, then all this is highly puzzling, especially so since a partitioning
of the variance in this study showed more than half of it related to
environmental conditions (Volkmar & Greenough, personal
communication).



The Norm of Reaction 

Let us consider how much variations in environmental conditions 
appear from existing evidence to be able to alter phenotypic measures 
of intelligence. 

Ever since Woltereck introduced the concept in 1909 (see Dunn,
1965), geneticists have concerned themselves with the "norm of
reaction" or the "range of reaction" as well as with Mendelian statistics
and the mechanisms of genetic transmission. The "norm of reaction"
is defined as the range of phenotypic reactions which a specified
genotype is able to produce in response to environmental influences
(Rieger, Michaelis, & Green, 1968, p. 372). Such a concept, like that
of heritability, can never be fully specified from empirical data because
a new investigator with imagination can always arrange a new
progression of environmental encounters which may alter further the
range of phenotypic reactions. In investigative practice, however, one
obtains relevant evidence in terms of the difference between the means
of phenotypic measures of a given trait for samples of individuals from
a given population of genotypes who are reared in different
environments. For the complex trait of intelligence and scholastic ability, one
can get evidence from comparing the mean values of phenotypic
measures for individuals from a given population reared under differing
environmental circumstances or differing educational programs.

Such data are still very few. In investigations with human subjects,
moreover, they are seldom based strictly on a single population of
genotypes. Nevertheless, highly suggestive evidence exists, and one set
comes from the study of the ages at which children achieve the various
levels of object permanence, perhaps the most basic of
epistemological achievements. Here the Athenian children in the study by
Paraskevopoulos and Hunt (1971) comprise about as close an
approximation of a single population of genotypes as one can expect to get.
They represent the lower half of the socioeconomic structure. The
American children in the still unpublished Worcester study by Uzgiris
represent the middle-class, but those from the Parent and Child
Center at Mt. Carmel by Schickedanz and Hunt represent families of
the lowest socioeconomic status who were recruited from those on
Welfare and on Aid to Families with Dependent Children. For
following a desired object through a series of hidden displacements in
reverse order, the extreme mean ages are 206.58 weeks, for the children
of the Municipal Orphanage where the child-caretaker ratio is 10/1,
and 72.74 weeks for the children of poor families reared with the aid
of the Badger Mothers' Training Program in the Parent and Child
Center at Mt. Carmel. For the Worcester children from middle-class
families, the mean is 91.36 weeks with a standard deviation of 9.43
weeks. Even though I have been developing ordinal scales in part to
escape Stern's IQ ratio and the normative approach to a meaning for
test performances and scores (Hunt & Kirk, 1971), it may be
worthwhile here for purposes of communication to cast these figures into
this familiar ratio by assuming that 91.36 weeks for the children of
middle class approximates the norm of 100. Thus, this empirical
evidence indicates that the range of reaction must extend at least from a
low of about 50 to a high of about 125, a range of reaction of 75
points of IQ.
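
The same ratio arithmetic can be laid out explicitly for the three
group means just cited. This is a minimal sketch, assuming, as the text
does, that the Worcester mean of 91.36 weeks stands for a norm of 100;
the function and variable names are illustrative only.

```python
NORM_AGE_WEEKS = 91.36  # Worcester middle-class mean, taken here as the norm of 100


def ratio_score(mean_age_weeks: float, norm_age_weeks: float = NORM_AGE_WEEKS) -> float:
    """Ratio score: 100 * (norm age / group mean age) for the top level of object permanence."""
    return 100.0 * norm_age_weeks / mean_age_weeks


low = ratio_score(206.58)   # Municipal Orphanage, Athens
high = ratio_score(72.74)   # Parent and Child Center, Mt. Carmel

print(f"low:  {low:.1f}")
print(f"high: {high:.1f}")
print(f"unrounded range: {high - low:.1f}")
```

The unrounded lower value falls a little under 50, at about 44, and the
upper value is close to 125. The figure of 75 points rests on rounding
the lower limit up to 50, for the reason the author gives about
cross-sectional means exaggerating the delay.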

One may well object that such a range could be found for only a 
simple function during infancy when the longitudinal validity of mea- 
sures of intellectual development is low. But this empirical difference 



The lower limit of this range deserves a word of explanation. Dividing 91.36 weeks
by 206.58 weeks yields less than .5, but since the mean of 206.58 weeks derives from
a cross-sectional study, it must exaggerate the delay more than would a longitudinal
approach with examinations every other week. For this reason, I have rounded the
lower limit to the approximation of an IQ of 50.

of 75 points is essentially the same as that found by Wayne Dennis
(1966) when he had the Goodenough Draw-A-Man Test given to
samples of typical children, aged between six and nine years, who were
living in typical family environments in some 50 cultures over the
world. The variation in the means of such IQs ranged from a low of 52
to a high of 124, a range of 72 IQ points. The variation in this
phenotypic measure appears to be associated with the degree of contact with
and participation in representative graphic art. Probably this
Draw-A-Man Test calls for a considerably less complex set of abilities, as
these are assessed by factor analysis, than either of the more standard
scales: the Stanford-Binet test or the Wechsler-Bellevue Children's
Scale. Yet, for American children, the IQs from the Draw-A-Man Test
correspond about as well as do IQs from either of these other two
measures with each other. It must also be admitted that children in a
cross-cultural study cannot come from a single population of genotypes, yet
typical individuals from the Syrian nomadic tribe have shown the
capacity to adapt themselves to technological cultures when reared
in them.

These empirically determined ranges of 72 and 75 points in mean
IQs fall only about one standard deviation short of the full range of
individual IQs (between 55 and 145) which includes all but a fraction of
a percent of individuals above the pathological group which bulges at
the low end of the distribution for the IQ. Clearly, there is dissonance
between any argument based on the statistics of heritability and this
argument from ranges of reaction.



Relevance: Heritability versus Norm of Reaction 

Perhaps this dissonance can be clarified by the respective meanings
of heritability and the norm of reaction. Heritability is, by definition,
that proportion of the total phenotypic variance which is genetic for a
particular characteristic in a specified population (Rieger, Michaelis,
& Green, 1968, p. 212). Heritability is not an attribute of a trait, but
rather of a trait in a specified population developing and living within
a given set of environmental conditions. What a given estimate of
heritability gives is the amount of gain or loss to be expected in the
course of selective breeding. Thus, given heritability for a particular
trait in a given population at 80 percent, if a sample of parents is
selected to have a mid-parent measure of the trait averaging one
standard deviation above the mean for the population, then the mean of
the measures for that trait in the offspring would be expected to
average .8 of a standard deviation above the mean of the population
(or .8 below it, for parents selected one standard deviation below).
This expectation would hold, however, if and only if the environmental
circumstances remained constant through the lives of the two
generations. Thus, an index of heritability tells us about how much of the
selection advantage or disadvantage is lost between parents and their
offspring.
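
The selective-breeding expectation described above amounts to a single
multiplication. The sketch below is offered only as an illustration of
that arithmetic. The function names are hypothetical, and the conversion
to IQ points assumes a standard deviation of 15, which is an assumption
of this example rather than a figure stated in the passage.

```python
def expected_offspring_deviation(heritability: float, midparent_deviation: float) -> float:
    """Expected offspring deviation from the population mean, in the same units as the parents'.

    Holds only if environmental circumstances stay constant across the two generations.
    """
    return heritability * midparent_deviation


def selection_lost(heritability: float, midparent_deviation: float) -> float:
    """Portion of the parents' selected advantage (or disadvantage) not passed on."""
    return midparent_deviation - expected_offspring_deviation(heritability, midparent_deviation)


H2 = 0.80        # heritability index used in the text
SD_POINTS = 15   # assumed IQ standard deviation (illustrative)

offspring_sd = expected_offspring_deviation(H2, 1.0)   # parents 1 SD above the mean
lost_sd = selection_lost(H2, 1.0)

print(f"offspring expected at {offspring_sd:.2f} SD, about {offspring_sd * SD_POINTS:.0f} points")
print(f"selection advantage lost: {lost_sd:.2f} SD")
```

Passing a negative mid-parent deviation gives the mirror-image case of
parents selected below the mean. Nothing in this calculation refers to
a change of environment, which is the limitation the following
paragraphs take up.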

On the other hand, an index of heritability can tell us nothing about
how much change in the measures of a phenotypic trait will result
from being reared in new environments. It can tell us nothing
about how much the IQs of the children from a given population of
genotypes will be changed through being reared in newly designed
environments and educational programs. Thus, a composite
heritability index of 80 percent for the IQ may say how much of the variance
in IQ is hereditary for the kinds of children studied who have
developed under the existing environmental conditions of American
and European culture, yet it tells us nothing of how much the IQ
might be changed by newly designed systems of child-rearing and
education. It is not relevant to why Project Head Start succeeded or
failed. Knowledge of how much the IQ can be altered by new regimes
of child-rearing and early education can be obtained only from
evidences of the range of reaction or from studies of the differences
between the means of IQ for children reared in differing environmental
conditions and with differing programs of early education.

In a dynamic and developing society, the conditions of child-rearing
and education are always changing, hopefully improving. Measures of
phenotypic intelligence would be expected to go up with these
supposed improvements unless the increases in intelligence are hidden in
comparative scores based on new norms. They do. In a number of
repeat studies, increases in average IQ have occurred rather than the loss
predicted by Cattell (1937) from differential fertility (see Hunt, 1961).
Cattell (1950) himself published one of these based on test surveys of
the children in an English city in 1936 and 1939. Instead of the
predicted drop of one point, he found a gain in mean IQ of 2.28 points.



The dissonance between the evidences for a composite index of the order of 75
percent or 80 percent for heritability of the IQ and the evidences of plasticity in
development has long puzzled me. I am greatly indebted to writings of Jerry Hirsch
(1970, 1972) and to discussions with him for this clarifying interpretation of this
dissonance.

Other studies have reported substantially larger gains in mean IQ. One
by Smith (1942), based on surveys of the children aged 10 to IJ^ in the
schools of Honolulu in 1924 and in 1938, reported a gain in mean IQ
of 20 points. Another by Wheeler (1942), based on tests, made 10
years apart, of samples of students aged 10 to 12 years from given
families in the schools of the Tennessee Valley before and after the
changes introduced by the Tennessee Valley Authority, reported a
10-point gain in median IQ. Another by Finch (1946), based on tests given
to all of the children in a sample of high schools in Minnesota during
the 1920s and again in the 1940s, reported gains in mean IQs ranging
from 10 to 15 points. In yet another such study, Tuddenham (1948)
compared a representative sample of draftees from World War II
with a sample from World War I on comparable forms of the Army
Alpha Group Test. The median for World War II fell at the 82nd
percentile of the distribution of World War I. Thus, half of the draftees
of World War II belong among the upper fifth for World War I (see
Hunt, 1961, p. 337ff.). So long as one considers a high index of
heritability relevant to educability, such gains would seem incredible, but
indices of heritability are not relevant.



Implications for Class and Race Differences 

The implications of these considerations for class and race differences
become readily apparent when one considers the inequalities of
environmental opportunity across the class structure of American and
European societies. Attempts to assess the genotypic potential behind
phenotypic measures of intelligence have always assumed essentially
equal environmental opportunity for growth, adaptation, and
learning with micro-inequalities randomly distributed. The past two
decades, however, have yielded abundant evidence of large deficiencies in
the development-fostering quality of the environments provided for
children of lower-class families of poverty whatever their ethnic origin
and race. Inasmuch as a major share of Indian, Mexican, Puerto
Rican, and black families fall in the poverty sector, their children
share the poverty-based deficiencies of poor white families plus
whatever additional disadvantages are associated with dark skins and
differences in language.

These deficiencies I have reviewed elsewhere in some detail (Hunt,
1969, pp. 202-214). They include basic nutritional deficiencies in a
substantial share of mothers at the time of conception and during gestation.
They include a lack of opportunities to acquire cognitive and linguistic
skills illustrated by such facts as the following: that where
approximately 90 percent of nursery school children can respond by picking
up appropriate blocks for the colors named by an examiner and
approximately 80 percent can name all of the six colors used, only
approximately 20 percent of four-year-olds beginning a Head Start
program can respond in these fashions (Kirk & Hunt, in preparation).
These deficiencies also include a lack of opportunities to develop the
motivational systems required for confidence and persistent striving
and also the opportunities to acquire those values and standards of
conduct demanded by the mainstream of a complex organized society.
These are not small variations in environmental opportunity, and
certainly they are not randomly distributed across the class structure.

Given the evidences of plasticity indicative of a range of reaction
for measures of phenotypic intelligence of the order of 75 points, and
given these class and race differences associated with poverty in the
development-fostering quality of the environments, and especially the
early environments provided for children, relatively small portions of
the commonly reported deficits in the means of IQ and measures of
scholastic performance for the children of unskilled parents (white,
black, Indian, Mexican, and Puerto Rican) can be considered to be
biologically inevitable. To be sure, the evidence and the argument
summarized do not rule out a contribution from heredity to these
differences. Yet if the Mothers' Training Program of Earladeen Badger
can bring the average IQ of even a small sample of families of poverty
approximately 25 points above that for middle-class families (as
assessed by the scale of object permanence), and if the Heber-Garber
program can bring the average Stanford-Binet IQ for children of black
mothers with IQs of 75 or below up to 128 at age 45 months, it becomes
hard to believe that more than a very minor share of the differences
among class and ethnic groups are biologically inevitable.

Recently, this case against the biological inevitability of class and
race differences has received empirical support from another
direction. In a study reported at the Washington meetings of the American
Psychological Association, George W. Mayeske (1971) reported a
special analysis of the data in the report on Equality of Educational
Opportunity by Coleman and others (1966) designed to ascertain the
degree to which that 25 percent of the variance in scholastic
achievement associated with racial and ethnic group membership could be
explained in terms of socioeconomic and educational circumstances.
From the relevant partial regression equations, he took into account
the socioeconomic status of each family, the presence or absence of
key members of each family, the assessments of the aspirations for
schooling by students and parents, parental beliefs about how
students might benefit from education, their region and neighborhood of
residence, and the achievement and motivations of the students
attending the school. When this was done, the variance among the
students in their academic achievement scores associated with
ethnic-group membership dropped to 1.2 percent. Similar findings have been
reported by Jane Mercer (1971) for Chicano and black children in the
schools of Riverside, California. The more the families of these
children resembled those of the modal configuration for the middle-class,
white community of Riverside in terms of five characteristics, the
more nearly did the mean IQs of the children approximate 100. In
these studies, both the partial regression equations used by Mayeske
and Mercer's modal characteristics of middle-class families contain
potentially genetic variance, but the force of such a consideration is
reduced by the evidence of IQs well above average for children of poor
white families and of black mothers with low IQs when those children
are provided with experiences which foster their psychological
development. One would guess from such combinations of evidence that
all but a very minor share of children of poverty or of unfavored
ethnic and racial groups could be reared in a fashion which would
permit them to perform quite adequately in our technological culture if
the economy provided the opportunity. Moreover, many of those
now typically fated for relative incompetence might well, with more
fortunate rearing, achieve excellence along one of the diverse avenues
of achievement in our society.



Recapitulation and Challenge 

Let me recapitulate and then present what I see as the challenge.
Significant deficiencies exist in the means of lower-class and certain
racial groups for many measures of ability, motivation, and
performance. Most of the evidence, however, concerns measures of
intelligence and scholastic performance. Composite attempts to
estimate the proportion of phenotypic variance in IQ which is genetic
approximate 80 percent. These heritability indices have been
interpreted to mean that most of the observed deficiencies in the mean
IQs for classes and races are biologically inevitable. These
interpretations are in puzzling dissonance with evidences of
plasticity which suggest that the range of reaction in the IQ must be
of the order of 75 points. Such evidence of plasticity becomes less
puzzling when one recognizes that estimates of heritability indicate
the loss of deviation from the mean of the population to be expected
from parents to offspring in experiments on selective breeding, that
indices of heritability are relevant only to the status quo within a
given population so long as environment remains constant, and that
these indices say nothing about how much the mean of a phenotypic
measure of intelligence or scholastic achievement will change with
development in new environments. Such information demands knowledge of
the norm of reaction which comes only from the difference between the
means of measures of achievement and intelligence for groups of
children from a given population of genotypes who have been reared
under differing environmental circumstances or differing educational
programs. Even though such evidence is sparse, that from two sources
indicates that this range is of the order of 75 points and that
special child rearing can boost the mean achievement for white
children of poverty and for black children from mothers with IQs of 75
or below well above the population average.

If measures of heritability say nothing about educability, then the
measures we have are irrelevant to the outcome of Head Start. But
several factors help to explain why Head Start is said to have failed.
I have elaborated these elsewhere (Hunt, 1969, Ch. 5). First, the
goals were unrealistic, and, in terms of a broader view of social
change, Head Start appears to have had considerable success, but in
ways that differ from those unrealistic goals. Second, our basic
understanding of psychological development and how to foster it was
inadequate to the task. This explains in considerable part the
unrealism of attempting to overcome the deficit deriving from four
years of experience in a summer or a year of compensatory education
without altering the milieu. Third, the nature of the nursery-schooling
available for deployment in the crash program of Head Start was poorly
adapted for the compensatory effect called for by the goals.

In one sense, the evidence outlined here may be viewed as optimistic,
perhaps too optimistic. It is one thing to say that most of the class
and race differences now evident are not biologically inevitable, and
it is quite another to say that reducing the deficits associated with
poverty is easy. We lack basic knowledge of early intellectual and
motivational development. Only recently have we begun to take
seriously the hierarchical conception of such development and begun to
describe the natural landmarks of achievement. An initial
approximation of such landmarks exists now only for sensorimotor
development (Uzgiris & Hunt, 1973) and for linguistic development
during the preschool years (Brown, 1973). We know exceedingly little
of what kinds of experience foster these successive landmarks and how
each one is built upon earlier achievements. Even though we have a few
instances where curricular suggestions have worked better than typical
middle-class rearing, no suggestion we can make now is more than a
hypothesis to be tested. Finding ways to get parents of poverty to
utilize innovations in early education is another problem. From the
experience of the Parent and Child Centers, it becomes clear that we
have not even begun to evaluate programs in terms of the determinants
of their success in eliciting parent cooperation and participation and
to examine the factors responsible for success or failure. For many
groups of parents any implied inferiority of their practices is hard
to take, and resentment hampers cooperation with the program.
Discovering ways of harnessing class, neighborhood, and ethnic pride
to get parental cooperation in the improvement of early education
along with discovering the kinds of experience which foster early
intellectual and motivational development are the basic challenges for
the behavioral and social sciences and professions concerned with
early childhood education.



Questions and Answers 

Q: To what age does plasticity continue? Can twelve-year-olds
catch up? Or, does the work you reported mean that trying to help
high school students is useless?

A: Although the evidence appears to indicate that plasticity decreases
considerably with age and with the diminishing rate of anatomical
maturation, at least some degree of plasticity continues even to
senility or perhaps death. Insofar as a person my age can have his
theoretical views modified by evidence or even learn a set of nonsense
syllables, a degree of plasticity remains intact. At age twelve, the
investment of effort required for a major modification of abilities or
motivational system appears to be substantially greater than it would
be at ages four or five, and greater yet than it would be at birth.
This diminution of plasticity appears to be due not only to diminution
in the rate of maturation, but also to the increasing levels of
ability already acquired, and to the amount of information in storage.
In terms of the hierarchical conception of the achievements included
in what we measure with tests of intelligence, individuals who fall
behind have typically failed to develop abilities, conceptions, and
motivational systems that enable them to process new information and
cope with new situations.

Now, is attempting to help high school children useless? Of course
it is not useless. On the other hand, it is likely to require more
investment of time and effort to achieve that minimum of competence
and motivation required to participate in the mainstream of our
complex technological culture than would compensatory education at
ages four and five. Moreover, such compensatory education can be
expected to call for more investment in this sense than would be
required for infants and young children from birth on. Modifying the
child-rearing of parents, however, is not without its own
difficulties.

Q: To what extent does nutrition affect brain-cell growth and,
therefore, intelligence? Can nutritional differences account for
observed class-related differences?

A: Reports by Cravioto, by MacDonald, and by Pasamanick indicate that
the incidence of dietary deficiencies and chronic health problems is
about four or five times as high among the families of the poor as
among families with average or higher incomes. The literature on the
association between neural maturation and nutritional deficiencies is
growing. Unfortunately, I do not know this literature well enough to
be able to synopsize it with confidence. I know better the literature
on the role of informational interaction in neural maturation, and I
have synopsized some of this. From what I do know about the
literature, it would appear that nutritional factors are among those
associated with class-related differences in intelligence as now
measured.



Q: What evidence exists that performance on object permanence is
related to more widely used measures of intelligence?

A: Nancy Bayley has included performances on object permanence in her
infant scales. Piagetians and cognitive theorists very commonly
believe that the ontogenesis of object construction is not easily
modified by experience. Our data show that the range of reaction for
the ages at which infants and toddlers living under differing
environments achieve the highest level on our scale of 14 steps in
object construction extends from sixteen and a fraction months to
approximately 45 months. Since object construction is among the most
"intellectual" of the performances used in any infant scales, where
motor behavior is usually used, it should be one of the best
indicators of intellectual achievement. I know, however, of no
correlational studies of the relationship of object construction to
the more widely used tests of intelligence. On the other hand, later
indices of intellectual development from Piaget have shown high
correlations with results from standard tests of intelligence.

Q: Do you know of any evidence bearing upon the improvement in
the IQ of high SES children through enrichment programs? Might not
class differences be maintained if these are also "plastic"?

A: It is generally assumed that the child-rearing of middle class
families approximates the optimum. If this should be the case,
children of middle class families should be developing about as
rapidly and as fully as their genotypes permit. I doubt seriously if
this is the case. On the other hand, their developmental rates
probably more nearly achieve their genotypic limits than do the
developmental rates of the children of poverty. The discussion of this
matter is amplified in my written paper.

Q: Will these miraculously raised IQs in infants and young children
endure at older age levels?

A: This is a serious question. From the standpoint of theory, however,
one would expect these raised levels to endure only if the
circumstances fostering the elevation of the IQ at the early ages
persisted through the later ages and stages of development. The
traditionally accepted constancy of the IQ appears to be a function of
at least three quite different factors: first, the genotypic
variations in learning potential traditionally assumed; second, the
progressively decreasing part-whole ratios involved in test-retest
consistencies with age; and third, the consistency in the
development-fostering quality of the circumstances encountered within
most families and most neighborhoods. Leon Yarrow has found measures
of characteristics in familial environments based upon two three-hour
time-samplings accounting for between 20 and 25 percent of the
variance in measures of performance on the Bayley scales at six
months. This finding suggests that the consistency in the
development-fostering quality of environments within families and
neighborhoods is far greater than we have imagined, and far more
important for the observed constancy in the IQ than we have ever
believed. As a converse corollary, the evidences of plasticity
indicated by the available evidences of the range of reaction tend to
confirm this importance.

Q: In view of your discussion of mother-child IQ relationships and
your reference to the work of Heber at the University of Wisconsin,
what do you think of the recent court decision in Iowa where children
were removed from one home because of parental incompetence, the
ascertainment of which was based on a low IQ?

A: The case in question is unknown to me. I would contend that
there are more relevant indicators of competence for child-rearing
than an IQ. In Puerto Rico, for instance, Dr. Albizu-Miranda has
reported instances of children no more than seven or eight years old
having higher mental ages than their parents, yet those parents were
managing their lives adequately. On the other hand, if the Heber
finding that a very high proportion of children with intellectual and
motivational defects come from a relatively small number of families
where the parents are intellectually or motivationally incompetent
turns out to be reproducible, then his other finding that educational
day-care for the children of such parents can bring their competence
up to or beyond average suggests that we may have to supplement
what parents can be taught and what they can do for their young in
these relatively few cases.



Sex Differences in Intellectual Functioning

Eleanor E. Maccoby and Carol Nagy Jacklin

This paper is in the nature of a progress report. We are currently in
the process of preparing a new edition of the 1966 book, The
Development of Sex Differences, and are reviewing all the relevant
literature we can locate on the topic of this paper as part of the
process of updating the annotated bibliography for that book. Our work
is not yet completed, and at this time we can only say what some of
the trends appear to be. I would like to begin by discussing whether
data available during the past seven years would cause us to
reconsider any of the generalizations that we and others had come to
in 1966.

First, the performance of the two sexes on measures of total, or
composite abilities, such as IQ tests: it is still a reliable
generalization that there are no sex differences on these tests. The
question itself is not a very interesting one, however. As we all
know, boys are better at certain kinds of items and girls at others,
and the sexes can be made to differ in one direction or the other, or
to be the same, according to the choice of a particular mix of items
for a test. Since time is limited today, I would prefer to move
directly to the differences in components.

*Revised version of a paper by the same name in The Development of
Sex Differences, Stanford University Press, 1966. © 1966 by the Board
of Trustees of the Leland Stanford Junior University.



Differences in Verbal Ability

Recent research continues to find female superiority on a range of
verbal tasks. However, it may be that we need to reconsider some of
our views concerning the course of development of this difference
with age. It has been thought that verbal differences begin very
early, from the time of the utterance of the first word or even
earlier, in babbling, but that the sexes become more alike on verbal
skills as they grow older. If we go back to the source of this
generalization, the 1954 McCarthy summary of studies of language
development, we find that the differences reported tended to be small,
and many, as McCarthy noted, were not significant even on large
samples. It was just that when there was a difference it favored
girls, and the many studies taken together added up to a significant
trend. The same was true generally in the studies done between the
McCarthy study and our own 1966 review (Maccoby, 1966), although the
study with the largest sample, Templin (1957), found no sex
differences over the age range three to six. In fact, however, there
has been almost no work with children under two and one-half or three
of a normative sort, involving large and unspecialized samples of
children, since the 1930s and 1940s. Work in the field of language
development has been very intensive during the past fifteen years, and
our understanding of this development has been very greatly enhanced,
but the work has tended to be focused upon very small and rather
highly selected groups of children. The fact is that we do not know
whether there has been a change in the relative standing of the two
sexes at these early ages with respect to articulation, length of
utterance, or early vocabulary. Small-scale recent studies, such as
those of Reppucci (1970) with two-year-olds, and Roberts and Black
(1972) with children of one and one-half to two, have found no sex
differences. Beginning at the age of two and one-half, we do have some
recent work with large samples. McCarthy and Kirk (1963) tested
children ranging from two and one-half to nine to obtain norms for the
Illinois Test of Psycholinguistic Abilities (ITPA). They found no
consistently significant sex differences. The only consistent trend
across age levels was that boys were better at "visual decoding," that
is, at receptive naming when the stimulus was visual. A set of seven
other recent studies involving children of pre-school age has found no
sex differences on a variety of verbal tasks. A major exception is in
the work done by ETS with children from impoverished families. Here,
girls are ahead on productive language, though a difference is not
found on "receptive," or "passive," language, that is, on the
understanding of words spoken by others.

As we move into the next age range, the early school years, there are
again few differences. Brimer (1969), who gave vocabulary tests to
very large samples indeed, found, in fact, higher average scores for
boys at each age from six through 11. Most studies, however,
including the ITPA norming sample mentioned above, detect no
consistent sex differences, and these include tasks involving
productive "fluency" as well as tests of understanding. The primary
exception is found in the work of the Stanford Research Institute,
with very large samples of disadvantaged children in Follow-Through
programs from kindergarten through the second grade. Here the girls
clearly test higher in a variety of language skills including reading,
vocabulary, and the understanding of relational terms.

It is at about age 10 or 11 that girls begin to come into their own in
verbal performance. It is from this age through the high school and
college years that we find them outscoring boys at a variety of verbal
skills. Sex differences are not found in every study; the findings
seem to depend in part upon whether tests of general knowledge are
called verbal tests; boys tend to do at least as well as girls on such
tests, and in the Project Talent sample, substantially better. But in
tests of verbal power, girls above age 11 do better, and in some
studies the difference is fairly large in absolute terms. We have
expressed as a rough estimate that, during adolescence, girls score on
the average about a quarter of a standard deviation higher than boys
on verbal tasks. One longitudinal study (Droege, 1967) which followed
a large group of high school students from the ninth to the 12th grade
found that the superiority of girls on verbal tasks increased through
this period. This study is especially interesting since its
longitudinal design permitted a control for differential dropout. We
think it is important to be clear that we are not talking only about
spelling, punctuation, and talkativeness. Included as well are
considerably higher-level skills, such as comprehension of complex
written text, quick understanding of complex logical relations
expressed in verbal terms, and verbal creativity of the sort that is
measured by Guilford's tests of divergent thinking.

We suggest that there are distinct phases in the development of
verbal skills in the two sexes through the growth cycle. One occurs
very early, before the age of three. We emphasize that the studies
documenting sex differences at this age are very old, and that we do
not know that the same situation prevails today. If it does, the
evidence indicates that the differences lie exclusively in productive
language, not receptive language, and the girls' advantage is
short-lived. At about three the boys catch up, and in most population
groups the two sexes perform very similarly until adolescence. The
exceptions during this phase are found in populations of
underprivileged children, where girls maintain an advantage to a later
age. We suggest that boys' greater vulnerability to hazards of all
sorts, including those prevailing prenatally, means that the worse the
prenatal and postnatal nutrition and medical care prevailing in a
population, the greater the sex difference in early performance will
be and the higher the age to which the difference will persist,
because of the presence of larger numbers of low-scoring boys who have
suffered some sort of systemic damage in the populations most at risk.
We will return shortly to the matter of variability and its possible
causes; but now let us simply note that for large unselected
populations the situation seems to be one of very little sex
difference in verbal skills from about three to 11, with a new phase
of differentiation occurring at adolescence.



Math Ability 

The earliest measures of some aspect of quantitative ability begin at
about age three with measures of number conservation, soon followed
by enumeration. There appear to be no sex differences in performance
in these tasks during the preschool years, nor in mastery of numerical
operations and concepts during the early school years, except in
disadvantaged populations. Here again the data from the large studies
conducted with Head Start and Follow-Through children show the
girls to be ahead. The majority of studies on more representative
samples show no sex differences up to adolescence, but when
differences are found in the age range nine to 13 they tend to favor boys.
After this age, boys tend to move ahead, and the sex differences
become somewhat more consistent from one study to another, though
there is great variation in the degree of male advantage that is
reported. For example, Project Talent finds that boys' math scores are
two-thirds of a standard deviation better than girls' at the 12th
grade, while Droege, also using thousands of cases, finds no
significant sex difference in high school; a large Swedish study finds
a difference of less than one-fifth standard deviation. It is not
possible to estimate, at this point, how large quantitative
differences are likely to be.



Spatial Ability

Spatial ability continues to be the area in which the strongest and
most consistent sex differences are found. But the superiority of boys
emerges relatively late. In the tests of psychomotor skills
administered during the first two years of life, the sexes do not
differ on items which might be thought to have a spatial component
(form boards, for example). There are now a number of studies in which
modified versions of the Embedded Figures Test have been given to
children of preschool age. One set of studies finds no sex differences
(Reppucci, 1970; Shipman, 1971; Sitkei & Meyers, 1969; Eckert, 1970;
Lewis et al., 1968). Coates (1972) found preschool girls to be
superior on embedded figures, and Corey (1970), working with large
numbers of children in kindergarten and first grade, found girls to be
superior on "geometric design" while boys did better on mazes. Sex
differences remain minimal and inconsistent until approximately the
age of 10 or 11, when the superiority of boys becomes consistent on a
wide range of populations and tests.

Before we consider some possible explanations of the trends we
have described, let us consider what is known concerning variability
in the performance of the two sexes at successive ages.



Variability 

The question of whether boys are more variable in their intellectual
performance than girls has been with us for a long time and still is
not solved. The question was raised initially in Terman's work, when
he identified more boys as "gifted" (Terman, 1925). The excess of boys
having IQs over 140 was found on a test where there were no sex
differences in the means of large samples, hence it appeared that boys
must be more variable, including more of both unusually high scorers
and unusually low scorers. As Miles and Terman both noted in the 1954
edition of the Carmichael Manual (Miles, 1954), the method of
selection of cases for the Terman study made interpretation of the sex
ratios difficult. The initial screening for location of high-scoring
children was done with children who were nominated by their teachers
as being especially bright, plus some additional children who
volunteered for the testing. We know that girls tend to underestimate
their own intellectual abilities more than boys do, and so there is
danger of sex bias in testing self-selected groups. Both Miles (1954)
and Terman and Tyler (1954) reviewed a number of studies to find out
whether there was a concentration of either sex among the very high
scorers on tests of mental abilities. They concluded that there
appeared to be no consistent tendency toward a higher incidence of
giftedness among boys, and that the sex ratios in the gifted range
depended on the content of the test.

Considering the mean sex differences reported earlier on verbal,
spatial, and mathematical tests, it should come as no surprise that
there would be a higher incidence of very high-scoring boys on tests
which emphasize content in which boys, as a group, do better. In
some recent tests to identify children with extremely high math and
science abilities at the junior high school level, Julian Stanley and
his associates have located considerably more boys than girls with
these talents. Presumably, if one looked for the exceptionally high
scorers on verbal tests, one would find more girls. Of course such
results do not necessarily mean that one sex is more variable than the
other; the two distributions could have equal standard deviations but
with the distribution of one sex simply displaced upward, yielding
more cases above any arbitrary cutting point.
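
The point about displaced distributions can be illustrated with a short calculation. The sketch below assumes normally distributed scores and uses hypothetical values throughout (a mean displaced by a quarter of a standard deviation and a cutting point two standard deviations above the lower mean); none of the numbers are data from the studies cited, and the function name is ours.

```python
from statistics import NormalDist


def share_above(mean: float, sd: float, cutoff: float) -> float:
    """Proportion of a normal distribution that lies above `cutoff`."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(cutoff)


# Hypothetical values in standard-deviation units (assumptions, not study data).
SD = 1.0            # identical spread for both groups
LOWER_MEAN = 0.0
HIGHER_MEAN = 0.25  # same shape, displaced upward by a quarter of an SD
CUTOFF = 2.0        # an arbitrary cutting point

lower_share = share_above(LOWER_MEAN, SD, CUTOFF)
higher_share = share_above(HIGHER_MEAN, SD, CUTOFF)
ratio = higher_share / lower_share

print(f"share above cutoff, lower-mean group:  {lower_share:.4f}")
print(f"share above cutoff, higher-mean group: {higher_share:.4f}")
print(f"ratio of the two shares:               {ratio:.2f}")
```

Under these assumptions the group with the higher mean supplies noticeably more cases above the cutting point even though the two standard deviations are identical, so an excess of high scorers in one group is not by itself evidence of greater variability.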

At the lower end of the ability scale, studies of the incidence of
learning deficits consistently indicate that there are more boys than
girls who suffer from such deficits. The greater vulnerability of the
male child to anomalies of prenatal development, birth injury, and
childhood disease is well known and needs no documentation here.
For our present purposes let us simply say that this vulnerability
probably does affect the incidence of very low scores on tests of
mental abilities. In school systems, children with such scores tend to
be siphoned off into classes for the educationally handicapped, and in
most psychometric work done with school children, these classrooms
are not included. Their inclusion would, of course, increase the
variability of boys' scores by adding more scores at the low end of
the distribution. When these cases are excluded, however, the
distributions of the two sexes on a variety of tests are remarkably
similar. We have plotted standard deviations for the two sexes, by
age, on tests of verbal, spatial, and quantitative abilities. Up to
adolescence, we find no tendency for either sex to have larger
standard deviations. After age 11, boys' standard deviations tend to
be about five to six percent higher than the girls'. Our tabulations
are not completed, but our conclusion at present is that there is very
little sex difference in variability prior to adulthood. The patterns
of abilities which occur among the gifted members of the two sexes may
be expected to differ somewhat, but there appears to be no basis for
expecting a difference in the number of boys and girls who have very
high over-all intellectual power. That men have historically achieved
greater eminence in science, literature, and the arts we do not doubt.
What we do doubt is that this difference is rooted in a greater
incidence of very high intellectual potential during the adolescent
years.




Possible Origins of Intellectual Sex Differences

It is time now to consider possible causes of the sex differences that
have been outlined briefly above. We will begin by examining possible
genetic components, and will consider how genetic factors might be
carried and expressed during development. Then we will turn to
experiential, or environmental, factors.

We all know some of the standard research on heritability of the IQ.
Composite intelligence does seem to have a substantial heritability
factor. When we turn to the question of inheritance of specific
abilities, particularly those which may be sex-linked, the genetics
become more complex, and we find little research that is directly
relevant. We have been able to locate four studies of parent-child
resemblances in spatial ability (Stafford, 1961; Corah, 1965;
Hartlage, 1970; Bock & Kolakowski, 1973). All show significant
cross-sex correlations. That is, boys' scores on tests of spatial
abilities are correlated with their mothers' scores but not their
fathers'. Girls' scores are correlated with their fathers' but less
with their mothers'. Stafford's hypothesis is that at least one
important genetic determiner for spatial ability is sex-linked, being
carried on the X chromosome, and being recessive. Girls, with their
two X chromosomes, would have a relatively low chance of receiving two
recessives, which would have to exist for the trait to be manifest.
Whenever boys got a recessive X, it would be manifest, since there
would be no dominant X to suppress it. Furthermore, boys would always
get their X from their mothers, not their fathers. The magnitudes of
the correlations that have been found in the three studies do fit the
Stafford hypothesis reasonably well. It should be noted that these
parent-child studies virtually rule out the possibility that spatial
ability is acquired through same-sex modeling. The modeling hypothesis
is implausible at the outset, because of the fact that there is not
much about an individual's spatial thinking that is open to the
observation of others. What precisely is a child to copy if he is to
learn spatial thinking by imitation? The overt responses involved in,
say, the solution of a Gottschaldt figures problem tell very little
about the thinking that led to the solution. Despite this problem, it
has been suggested that since men on the average have higher space
ability, boys must acquire it by selective imitation of their fathers
rather than their mothers. The studies of parent-child resemblance
show that children in a given family are more likely to resemble the
cross-sex parent, and that if boys have higher space ability this
occurs despite their similarity to their mothers, not because of any
matching to their fathers' scores.
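
The arithmetic behind the X-linked recessive hypothesis described above can be sketched as follows. The sketch assumes random mating and a single recessive allele with frequency q among X chromosomes; the function name and the values of q are hypothetical illustrations, not estimates from the studies cited.

```python
from typing import Tuple


def expression_rates(q: float) -> Tuple[float, float]:
    """Expected proportions (boys, girls) who manifest an X-linked recessive trait.

    q is the assumed frequency of the recessive allele among X chromosomes.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie between 0 and 1")
    boys = q        # one X, received from the mother; a recessive is always manifest
    girls = q * q   # two Xs; both must carry the recessive for the trait to appear
    return boys, girls


# Hypothetical allele frequencies, chosen only to show the pattern.
for q in (0.2, 0.5):
    boys, girls = expression_rates(q)
    print(f"q = {q}: boys {boys:.2f}, girls {girls:.2f}")
```

For any q strictly between 0 and 1 the expected proportion of girls manifesting the trait is smaller than that of boys, which is the asymmetry the hypothesis relies on.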

It appears likely, then, that there is at least some degree of genetic
control over spatial ability. We have not yet located comparable
figures for verbal abilities, so we do not know whether girls' verbal
superiority has a similar basis. But where does our genetic knowledge
about spatial ability leave us? Knowing that there is a genetic
component in this ability does not help much in understanding the
process and events that are involved in the development of spatial
thinking. Does the space gene control the rate of development of
certain parts of the brain? Does the space gene operate through the
mediation of sex hormones?

An intriguing notion has recently been advanced by Kimura (1967),
by Levy-Agresti and Sperry (1968), and others, that space ability is
related to the degree of lateralization between the two hemispheres of
the brain. As it bears upon sex differences, one line of reasoning is as
follows: in adults, spatial ability tends to be localized in the right
hemisphere of the brain, verbal functioning in the left. In early
childhood, the two hemispheres of the brain are not greatly specialized
in function, and lateralization tends to occur over a period of growth.
Girls are on a faster developmental timetable than boys. Hence
hemispheric dominance for some functions tends to be established
earlier for girls. Evidence for this comes from work by Kimura using
a dichotic listening technique. If different verbal messages are
presented to the two ears, the adult listener tends to hear what came
to the right ear, since this message goes mainly to the left side of the
brain where speech is localized. In testing a middle-class sample of
children aged five through eight, Kimura found this speech
lateralization well established in children of both sexes through the
whole age range tested. But later work with economically less
advantaged children showed that five-year-old girls had established
left-hemisphere specialization for speech while five-year-old boys had
not (Kimura, 1967). And there is evidence that in boys with reading
disabilities, the lag in lateralization is even greater (Taylor, 1962).
Another instance of sex differences in rate of development of cerebral
dominance is provided by Ghent (1961), who reports that differential
touch sensitivity between the preferred hand and the non-preferred hand
normally has developed in girls by age six and in boys it is delayed
till age 11.

So far we seem to have a possible explanation of the boys' early lag
in verbal functioning and their catch-up in middle childhood, with the
lag continuing longer in children from disadvantaged populations. If
we can assume lateralization to be an advantage for most intellectual
functions, however, the rate-of-maturation hypothesis would help to
explain male deficits early in life, but it would not explain female
deficits. And the greatest problem with the hypothesis is that, as we
have seen, there is really very little sex difference in any of the major
component abilities during early and middle childhood. Sex differences
emerge strongly after the age of 11, when lateralization is presumably
as complete as it is going to be in both sexes. It is possible,
of course (as Sperry has suggested), that once left-hemisphere
dominance has been established it tends to inhibit the development of
functions that would normally be specializing in the minor hemisphere.
Thus, the boys' delay in establishment of hemispheric dominance
might give them time for visual-spatial functions to develop
strongly. However, they do ultimately develop left-hemisphere
dominance for speech functions, so the question remains: why does male
spatial development show the greatest spurt during adolescence? Of
course, there may be delayed effects. To find out whether early
language development in any way shuts off the development of spatial
ability, what is needed, we think, is examination, within sex, of the
relationship between the rate of early language development and later
levels of spatial ability. If boys who talk late have better spatial
ability in later childhood than boys who talk early, this would be good
evidence for the inhibiting effects of early left-hemisphere dominance
on right-hemisphere functions.

In the absence of this kind of data, we are pretty much in the dark
as to whether there is an early critical period for spatial ability when
the absence of lateralization is especially important.

Levy-Agresti and Sperry present a different line of reasoning with
respect to brain dominance. They argue that lateralization is less
strong in women, and in fact that women and left-handed men share
a number of problems related to weak hemisphere dominance,
particularly deficits in the visual-spatial sphere. The Levy-Agresti and
Sperry hypothesis might explain the large and increasing superiority
of males on spatial tasks, but it will not explain the large and
increasing female superiority in verbal and conceptual thought. We find
the Levy-Agresti and Sperry hypothesis and the Kimura hypothesis
incompatible. That is, one cannot explain male spatial ability as due to
stronger lateralization in males, and then attempt to explain female
verbal superiority as due to stronger (or earlier) lateralization in
females. Both things are currently being said.

Another problem that fosters skepticism about the brain-lateralization
explanation of sex differences is the nature of the cluster of skills
controlled by the two hemispheres. The right hemisphere controls
spatial visualization, at which girls are worse, but it also controls fine
perceptual-motor coordination, at which girls are better; the left
hemisphere controls language fluency and reading, at which girls tend
to be better; but it also controls other elements where girls have no
advantage. Levy-Agresti (1968) has described the left hemisphere as
follows: it is verbal, sequential, detailed, analytic, and computer-like.
This does not sound like the traditionally feminine package of skills,
or even like a psychological package we can associate with any group!

We are prepared to believe that differential brain lateralization may
eventually turn out to be related to the different patterns of abilities
the two sexes develop. But precisely what the relationships are seems
very unclear at the moment, and we must simply consider it an open
question.

Let us turn now to a different mechanism through which a
genetically-controlled sex difference might work. We refer to the action
of male and female hormones. We might begin with the well-known
study by Ehrhardt and Money (1967), in which a group of ten girls
were located whose mothers had been given an androgenic hormone
during pregnancy. The ten girls had all been physically masculinized
to some degree. Their social behavior and interests showed masculine
tendencies as well. For our present purposes, the point of major
interest is that the girls at age 13-14 had an average IQ of 125. Ehrhardt
and Money note that this is above the national norm of 100. It should
be noted, however, that these girls came from families with higher than
average education. Six out of nine of the fathers had college education.
One does not know precisely what level of IQ to expect from daughters
of untreated mothers in families at this education level, but it is safe to
say that the average would be above 100. More interesting is the work
by Dalton (1968), in which pregnant women were given dosages of
progesterone, a female hormone. Follow-up work has been done with
both the male and female offspring of these women. The children have
significantly higher IQ scores than matched controls, and this is true
of both sons and daughters born following progesterone-treated
pregnancies. It appears that either male or female hormones can serve
to promote whatever aspects of prenatal growth relate to ultimate
intellectual strength. But the work so far does not help us to understand
the development of different patterns of abilities in the two sexes.

There are similar problems with the work of Broverman and his
colleagues (Broverman, 1964; Broverman et al., 1968; Klaiber et al.,
1971). Broverman and others classify abilities in a different way than
along spatial or verbal dimensions: they claim that males are better
at tasks that call for the inhibition of previously-learned responses,
and those that call for restructuring, while females are better at simple,
overlearned tasks. In reviews of this work Parlee (1972) and Kagan
and Kogan (1970) raised questions about the validity of this
classification, and we will not repeat their criticisms here except to say
that we believe they are cogent. For our present purposes, the point of
greatest interest is the effects of experimentally administered sex
hormones on performance in certain selected tasks. Unfortunately, the
researchers used only male subjects and used only serial subtraction as
a task for measuring the effects of the hormone administration. If serial
subtraction can qualify as a simple overlearned task, then it would be
included with the group of feminine skills. When testosterone was
administered, performance on this task declined less, over an interval
of time, than when no testosterone was administered, suggesting that
the male hormone facilitated performance on this feminine task. This
finding is consistent with the Broverman view that male and female
hormones act in the same direction (that is, toward feminization of
intellectual performance) but that female hormones are more powerful.
Unfortunately the experimental design does not permit a test of
this view, since female hormones were not administered, and no task
requiring inhibition of a previously learned response was used. Thus
we cannot compare the effects of the two kinds of hormones on what
Broverman considers to be typically masculine and typically feminine
tasks. It would be highly useful, in research of this kind, if both male
and female subjects were used, since there might easily be an
interaction between the individual's hormonal history and the effects of
an added hormone.

At this point, we must simply say that we consider the work on the
relation between sex hormones and intellectual performance to be
totally inconclusive.

Having encountered a good deal of frustration in our efforts to
understand the possible biological factors underlying sex differences
in intellectual performance, let us now briefly consider the
social-emotional and experiential factors. Remembering that the major
sex differences emerge after the age of 11, we are inclined to give a good
deal of weight to the social-emotional influences that impinge
primarily at adolescence. We believe it is true that many girls do develop
what Horner has called the "will to fail" at this age. Patterns of male
dominance and female acquiescence probably also become especially
strong in the interactions between the two sexes at this age, and bright
girls do begin to hesitate to compete with boys, or to take opposing
positions in their presence. The problem is simply this: we can detect
a number of factors which might have a generally inhibiting effect on
the intellectual growth of one sex or the other. What we have to
explain, however, is an increasing superiority of girls on verbal tasks,
and an increasing superiority of boys in spatial tasks and possibly
some quantitative tasks as well. What social factors improve
intellectual performance of one kind while interfering with other kinds?
We are hard put to guess what those factors might be. We have not been
able to locate any solid research which relates any sort of social
pressure or parental socialization practices in adolescence to specific
patterns of abilities. Of course we may be dealing with delayed-action
effects; it might be that there is something about the ways boys and
girls are treated in early childhood that predisposes them to certain
patterns of intellectual development in adolescence. Witkin, for
example, has argued that maternal behavior which fosters independence
in young boys will be associated with a boy's developing a high level
of "perceptual differentiation," a characteristic which has a high
loading on spatial ability as it is ordinarily measured. This might help
to explain the later emergence of high spatial ability in boys if we could
document that they are granted greater independence during
childhood. A recent survey we have done on differential parent treatment
of the two sexes during the first six years of life (Maccoby, 1972) has
indicated that the two sexes are about equally restricted during this
period; girls are allowed as much independence as boys, on the
average. We also have reason to doubt that there are clear differences in
dependency between the two sexes during these early years, at least
as this variable is usually measured. Furthermore, mothers talk to
their young sons as much as they do to their young daughters. The
notion that girls' superiority in verbal development is rooted in greater
early childhood dependency, or greater amounts of verbal stimulation
or reinforcement, simply cannot be documented from the evidence we
have at present on parent-child interaction. What about the social
milieu in which girls and boys live during middle childhood and late
adolescence? Here there may be, and probably are, some differences
in how much freedom of movement and freedom from surveillance is
allowed the two sexes, but we lack evidence. There can be no doubt
that boys spend more time on sports, and that girls are more interested
in other kinds of social interaction. But let us think seriously about the
kinds of tasks that girls do well on during the high school years; tests
of verbal abilities at this age include verbal analogies, selection of
precise opposites for relational terms, and logical problem solving. Is
a girl going to be able to solve an item on the Miller Analogies Test,
or make a correct deduction from a premise in a syllogism item,
because she has chattered with her girl friends after school instead of
playing ball? Is she going to be better at diagramming sentences
simply because she has uttered more sentences during the day? We are
profoundly skeptical about hypotheses of this kind. We suspect that
the girls who spend most time chatting about nonschool-related
matters are not going to be the girls who will get the highest scores on
tests of abstract verbal ability, and that girls' greater interest in social
affairs is not the explanation of their superiority on such tests. We
encounter the same kind of problem when we attempt to explain boys'
spatial superiority in terms of their greater interest in sports, or other
activities that would mean more moving about through space. It seems
unlikely to us on the face of it that boys will be helped in developing
spatial ability by throwing a basketball through a hoop while girls are
not helped by threading a needle; yet this is about the level of much
current speculation on the subject. Empirically, not much is known
concerning the characteristics of high-space boys, but Witkin's work
suggests that they tend to be rather quiet and bookish. In a study of
differential abilities done at Stanford some years ago (Ferguson &
Maccoby, 1966), we found that high-space boys were significantly
lower in aggression, and significantly more withdrawn, than low-space
boys. Thus the kinds of interpersonal activities that boys do engage in
more frequently than girls, such as rough and tumble play, do not
seem to foster the development of the ability in which they most excel:
spatial ability. In fact, it would appear that they have good spatial
ability in spite of their patterns of daily activities, not because of
them.

We feel we should apologize for having given you a recital of what
we do not know about the origins of intellectual sex differences. We
would like to have been able to be more positive. But perhaps divesting
ourselves of some misconceptions may not be a bad way to begin the
complex task of understanding the factors that underlie sex differences
in intellectual functioning.



Questions and Answers 

Q: Is there any information on the possible effects of emotional and
attitudinal differences between the sexes on subsequent development
of sex differences in specific abilities?

A: There have been efforts to link some of the well-documented
emotional sex differences, such as the greater aggressiveness of the
male, to specific intellectual abilities. There is a psychoanalytic
hypothesis, for example, that mathematics favors males because it is
essentially aggressive; that is, in such mathematical analysis as that
involved in algebra, for example, one must continually destroy existing
statements and substitute new ones. To call such mathematical
operations "aggressive" seems to us inherently implausible, but if one
were to take the hypothesis seriously, one would have to account for male
superiority on non-aggressive (and hence "feminine") aspects of
mathematics such as integral calculus.

There is little doubt that the sexes do differ from an early age in
certain temperamental qualities: boys are more competitive, girls
more conforming. Furthermore, the massive input of sex hormones
at the beginning of adolescence does change the internal emotional
climate for both sexes, a fact which might well have implications for
their intellectual development during this period. There is very little
research on what the implications are, however. In speculating about
promising possibilities, we recommend taking a cue from the work of
Schachter. He found that an injection of a given hormone would serve
as a general arouser, but that the nature of the emotional state which
was experienced and expressed depended on aspects of the external
situation. In a similar vein, we regard it as likely that both male and
female hormones function as arousers. The two kinds of hormones
may have sex-specific effects, of course, on the behavior of adolescent
boys and girls toward members of the opposite sex. When it comes to
their effects on intellectual interests and performance, however, we
suggest that an influx of sex hormones might make the individual
either more effective or self-defeating, and strengthen tendencies in
either an anti-intellectual or a pro-intellectual direction, depending
upon the individual's social milieu and his habitual modes of dealing
with increased activation. In short we believe that the effects of sex
hormones on intellectual performance, though potentially powerful,
will prove to be susceptible to channeling through social influence.

Q: What are the implications of early sex-typing in interests and 
activity preferences for the later development of specific intellectual 
abilities? 

A: There is good evidence for sex-typing in toy preferences during 
the preschool years, and for considerable differences in the extra- 
curricular interests of boys and girls during middle childhood. We 
have commented in our paper that we do not see boys' interest in 
sports as directly linked to spatial or mathematical ability. Boys' 
interests in mechanical gadgets of all kinds, however, may be
important precursors of these abilities. To our knowledge, the longitudinal
evidence to support or refute this hypothesis does not exist. There is 
some recent evidence that girls at about the age of 10 begin to believe 
that studying math and science will have little relevance to their 
future work or life styles, while boys do see these studies as relevant. 
The emergence in early adolescence of views about the probable
nature of their future lives may have a good deal to do with the rapid sex
differentiation in intellectual interests and skills that occurs at this age.
My topic concerns the sources of individual and group differences in
test scores for cognitive traits and the implications of these differences
for test interpretation. Actually the topic could be broadened to
include any quantifiable information concerning human abilities or
achievements, including interviews, letters of recommendation, and
other ratings, with little if any modification of the discussion. The
problem generally is one of drawing inferences about an individual
from any source of information or combination of sources, whether
the information be categorical or continuous. Nevertheless, I shall use
"test" for convenience, but please remember the broader context and
the broader implications of my discussion.

My discussion also centers on normatively scaled tests administered
for purposes of revealing individual differences in performance.
Failure to discuss such tests as mastery, criterion referenced, and
diagnostic does not mean that these other types of tests are unimportant.
To the extent that good tests of these latter types are available, they
should certainly be used to supplement, but not to supplant, the
information obtained from normative tests of abilities and achievements.

There has been, historically, considerable concern about the
"fairness" of the inferences drawn from test scores for individuals from
certain subgroups in our population. Twenty years ago, interest
centered on lower-class children. Today the focus is on ethnic or racial
minorities, both children and adults. Fairness, however, is a word
laden with overtones of past and current wrongs, discrimination, and
so on, and I shall for the time being substitute accuracy of inference
for fairness of inference.

The accuracy of an inference is assessed along two classical
dimensions: the amount of random error present and the amount of
constant error present. Criterion measures, like predictor information,
can be either continuous or categorical, and the method of assessment
of the accuracy of the inference differs slightly in the two situations.
For a continuous criterion, the standard error of estimate reflects
random error; for a categorical criterion the analogue is the size of
the difference between the conditional probabilities relating test score
to the two or more categories. Constant error is assessed by the
amount of over- or underestimation of the score on the continuous
criterion, or of the over- or underestimation of the conditional
probabilities for the categorical criterion, as a function of the
demographic group to which the individual belongs.

When there is an independent, continuous criterion measure
available, these two dimensions of accuracy of a test are translatable into
the regression comparison described by Gulliksen and Wilks (1950)
and popularized by Cleary (1968). The size of the standard error of
estimate is inversely related to the slope of the regression line, and the
assessment of constant error involves comparing the intercepts of
parallel regression lines or of differences between nonparallel lines.
Comparisons of conditional probabilities, including both differences
among categories and among groups, are equivalent statistical
operations for independent categorical criterion measures.
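
A minimal sketch of the kind of regression comparison just described
may help. The data are hypothetical, the function names are
illustrative, and the sketch only computes and subtracts the fitted
slopes and intercepts; it does not carry out the significance tests a
real study would need.

```python
def fit_line(scores, criterion):
    """Ordinary least-squares fit of criterion = intercept + slope * score."""
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(criterion) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, criterion))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def compare_groups(group_a, group_b):
    """Difference (A minus B) in the slopes and intercepts of two groups."""
    slope_a, intercept_a = fit_line(*group_a)
    slope_b, intercept_b = fit_line(*group_b)
    return {
        "slope_diff": slope_a - slope_b,
        "intercept_diff": intercept_a - intercept_b,
    }


# Hypothetical (test score, criterion) pairs for two demographic groups.
group_a = ([40, 50, 60, 70], [2.0, 2.5, 3.0, 3.5])
group_b = ([30, 40, 50, 60], [1.8, 2.3, 2.8, 3.3])

print(compare_groups(group_a, group_b))
```

In this made-up example the two lines are parallel but have different
intercepts, so a single common equation would carry a constant error
for at least one of the groups.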

Without an independent criterion measure on individuals, which is
the usual situation in the use of academic achievement tests, inference
is made to a theoretical true score. Since the true score is theoretical,
it can always be considered continuously and normally distributed.
In this situation, random error is directly related to measurement
error and constant error is determined by judging the degree of
content validity of the test in two or more groups. In doing the latter it is
right and proper to compare test content to the curriculum content
to which the examinee group has been exposed. To compare the test
to the content of the total life learning experiences of the examinee
group subverts the purpose in giving an achievement test.

I am now ready to define the fairness of an inference concerning an
individual drawn from test scores: random error should be minimized
within reasonably generous limitations of testing time, and the amount
and direction of the constant error should be known and allowed for.

The first aspect of fairness means that validity of the test, or
combination of tests, should be as high as possible within broad practical
limitations. There is rarely any good excuse for making inferences
affecting the lives of individuals on the basis of very short tests, or
shortened forms of standard tests. If the inference is worth making,
in fairness to the individual, it should be made on the basis of
maximum validity. When inference is made to a categorical criterion
measure, however, decisions can be made or advice given with high
confidence about individuals who have extreme scores on the test. Use of a
well-designed sequential strategy is acceptable, and the injunction
concerning use of short tests is modified accordingly.

Although we have not improved academic prediction much in the
last 50 years, my first aspect of fairness is relatively clear-cut with
respect to the research and statistical operations required. In contrast,
the second aspect involves a host of problems. In the first place, any
given individual belongs to a very large number of different
demographic groups. Which ones should be selected for research? Or must
we investigate all of them? If each demographic variable is divided
into only two categories, and if there are as few as 10 such variables,
1,024 separate groups are defined. I have some confidence that groups
showing the largest test differences are most likely to show intercept
differences with the consequent constant error in the estimate of the
criterion score for any given individual, but this is far from being an
infallible guide. As a matter of fact, it is a far more accurate guide to
the sources of discontent and of political pressure to discontinue the
use of tests than it is to the dimensions of test accuracy and the related
aspects of fairness of inferences drawn from tests.
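
The figure of 1,024 groups follows from simple counting: ten variables
with two categories each define 2 to the 10th power combinations. The
short sketch below, in which the helper name is illustrative, checks the
count both by formula and by enumerating every combination.

```python
from itertools import product


def count_groups(n_variables, categories_per_variable=2):
    """Number of distinct groups defined by fully crossing the variables."""
    return categories_per_variable ** n_variables


# Enumerate every combination of ten two-category variables.
cells = list(product((0, 1), repeat=10))

print(count_groups(10), len(cells))
```

Each additional two-category variable doubles the number of groups,
which is why investigating all of them quickly becomes impractical.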

A second source of problems with the constant error dimension is
the number of examinees in the two or more groups being compared.
Given a large enough N, any two groups will probably show a
significant difference in intercepts. This is based on the premise that there
are no completely zero differences in nature. There is a third source of
difficulty also. If for a large N a single regression line provides, by
chance, a very close fit to the separate regression lines for two or more
groups differing in mean score on the test, a change in the reliability
of the test or the addition of a similar test in a composite (Linn and
Werts, 1971) will produce a difference in intercepts.
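
One way to see why a change in reliability alters the intercepts is a
classical true-score sketch. It is an illustration under assumed
conditions, not a reproduction of the Linn and Werts analysis: the
criterion is taken to have the same regression on true score in both
groups, the test is assumed equally reliable within each group, and
measurement error is assumed uncorrelated with true score. All names
and numbers are hypothetical.

```python
def observed_regression(a, b, reliability, group_mean):
    """Within-group regression of the criterion on the observed test score.

    Assumes criterion = a + b * true_score (plus independent residual) in
    every group, and observed = true_score + error with the stated
    within-group reliability. Measurement error attenuates the slope to
    b * reliability, and the intercept shifts so that the line still
    passes through the group means.
    """
    slope = b * reliability
    intercept = a + b * (1.0 - reliability) * group_mean
    return slope, intercept


def intercept_gap(b, reliability, mean_1, mean_2):
    """Difference in observed-score intercepts between two groups."""
    return b * (1.0 - reliability) * (mean_1 - mean_2)


# Hypothetical values: groups differ by 10 points in mean test score.
for reliability in (1.0, 0.9, 0.8):
    print(reliability, intercept_gap(0.5, reliability, 60.0, 50.0))
```

Under these assumptions the intercepts coincide only for a perfectly
reliable test, and the gap changes whenever the reliability changes, so
a close fit to a single line at one level of reliability need not
survive a change in the test.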

It appears obvious from the above discussion that we cannot use
tests with perfect fairness as herein defined. We simply do not have
enough information, and the information required is all but
unobtainable. Nevertheless, there is a limited amount of information available
from which we can try to assess the degree of unfairness that has
resulted from the common practice of using a single regression equation
for all demographic groups. There is no reason to be distressed by the
lack of a perfect fit between any mathematical model and the real
world (such discrepancies are ubiquitous, not rare) as long as the
lack of fit is sufficiently small.

The problems of male-female and black-white differences have been
most extensively studied. There are, in addition, studies of social-class
differences and at least one adequate study of regional differences.
Reports on Spanish-speaking Americans and American Indians are
scattered and inadequate.

First, it is well to define what I mean by adequate research. If there
is an independent continuous criterion measure, the research must
involve a comparison of the regression equations. The two
characteristics of these equations that are critical are, as noted earlier, the
slopes and the intercepts. (The preliminary assumption of equality of
standard errors of estimate can, within limits, be violated. Inequality of
standard errors of estimate that results from unequal slopes is the
important phenomenon.) Unfortunately, the literature on test fairness
has become confused (one might even say contaminated) by reports
of research incorrectly analyzed. Inability to reject the statistical
hypothesis of zero slope in a small sample of a minority group does not
constitute evidence for differential validity of the test for that group.

When the literature reporting regression comparisons is
summarized, the following conclusion seems warranted: there is relatively
little difference in the slopes or intercepts of regression lines as a
function of the demographic groups that have been studied. Use of a
single regression equation for these groups leads to no substantial
degree of unfairness in drawing inferences concerning the criteria
measured. The amount of error involved is generally less than the
sampling errors of regression coefficients based upon Ns of the size
typically found in validation studies for minority groups.

If needed, this conclusion can be made more detailed. The small
errors resulting from the use of a single equation are of some
theoretical importance. Slopes of lines for males and females are about equal,
but there is a fairly consistent tendency for female performance on the
criterion to be underpredicted by a small amount; that is, women tend
to perform at least as well as their test scores indicate. Slopes of
regression lines for black students, particularly for black males, when
compared to white students may be a little lower. There is also a fairly
consistent tendency for the constant error for blacks to be a small degree
of overprediction; that is, they tend to perform at best at the level
indicated by their test scores.

As hypotheses to explain these discrepancies I suggest the following:
a higher mean motivational, or other important noncognitive, level
for females, and a higher contribution to criterion variance of
motivational or other important noncognitive components for black males.

Note that the above conclusions are not invariant laws of nature. If
there is sufficient intervention in training, or if the criterion measures
differ for the groups compared (Bowers, 1970), a single regression
equation is no longer a good approximation for both blacks and
whites. It should also be noted that regression comparisons have not
been made for criteria in early childhood education and have not been
made for criteria removed in time by several years from the point of
test administration. Regression comparisons have been made,
however, for appropriate criteria in education, industry, and the military
with comparable results in all of these areas. All in all, there are so few
exceptions to the preceding conclusions that I feel no great concern
about the unfairness of inferences from tests when a single regression
equation is used to predict practical criteria.

The lack of empirical support for big differences in either intercepts
or slopes of regression lines should not be surprising. There was never
any good theoretical expectation that such differences would be found.
The assumption of cultural deprivation does not in itself constitute a
theory. When one starts with that as a premise and develops the
intervening steps in the reasoning in order to form a theory, the
expectation is that deprivation will depress the scores on both the predictor
test and the criterion measure, but not the correlation between the
two. Approximate identity of regressions for minority and majority
groups is the result anticipated.

The only theoretical bases for a different expectation that have
occurred to me will be instantly, and correctly, found unacceptable
by almost everyone in this audience. One is applicable only to racial
comparisons and goes as follows: Negroes and Caucasians either belong
to different species or the environmental differences have been so
profound that the same psychological principles do not apply to both
groups. Therefore, the organization of ability measures will differ
radically in the two groups. The second is hardly more plausible, even
granting an antecedent in Spearman's "mental energy" concept of
50-70 years ago. In this second "theory," intelligence and other
aptitudes are considered to be potential forces or mental powers like water
under pressure or like stored electrical energy. Then the hydrant or
switch is finally found and uncovered, after being buried by years of
deprivation; intellectual power is available, fully developed, the minute
the spigot is turned or the switch thrown.

I do not know of any careful studies of the content validity of 
achievement tests for different demographic groups. The problem has 
become confused by dragging in the red herring of discrimination and 
early deprivation. There is no reason to believe that standardized 
achievement tests, selected by a school to fit its curriculum, are not
fair measures, as I have defined the term, for different demographic
groups. A high school senior who reads at the sixth grade level may
have had his level of achievement depressed by early deprivation, but
the fact remains that he does not read very well. If reading is
important, as it certainly is, both in further academic work and in
citizenship, it is important to know the size of the deficit without regard to
its possible causes.

In contrast to inferences concerning practical criterion measures,
inferences are frequently drawn concerning capacity to learn from
aptitude and intelligence tests. In this interpretation, these tests are
sharply distinguished from tests of what has been learned. These
hypothetical capacities are typically interpreted as fixed or constant
and are frequently assumed to be innate as well. Such inferences go
far beyond the available evidence, or are even contradicted by good
evidence, and are highly unfair to all examinees, both children and
adults, and members of both privileged and underprivileged groups.

In the first place, if there are capacities to learn, they are not
measured directly by any present psychological or educational test. The
very act of testing requires that the examinee have acquired a
repertoire of learned responses. Secondly, the functions measured by
intelligence and aptitude tests are not stable in the individual over time.
Intercorrelations among a series of tests of the same function
administered over a period of time invariably fall into a simplex pattern.
Test-retest correlations over a time span are never as high as the
respective reliabilities, and continue to drop as time increases. Finally,
there is the issue of innateness of the function. If we knew the
correlation between phenotype and genotype, we would estimate a score for
the genotype and attach to it a standard error, but we do not know
that correlation. In place of knowledge we have widely varying
estimates of the correlation with the variance of the estimates depending
more upon the assumptions of the person making the estimate than
upon the size of his sampling error. If the correlation were known,
however, estimation would proceed by regression methods in a fashion
parallel to the estimation of true score on the test.
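
The regression estimate of a true score mentioned above can be sketched as follows. This is only an illustration of the classical regression (Kelley) estimate; the function name, the score scale (mean 100, standard deviation 15), and the reliability of .90 are assumptions chosen for the example and do not come from the paper. The same arithmetic would apply to a genotype estimate if the squared phenotype-genotype correlation were known and substituted for the reliability.

```python
import math


def kelley_estimate(observed, mean, sd, reliability):
    """Regress an observed score toward the group mean.

    Returns the estimated true score and the standard error of that
    estimate. `reliability` is the squared correlation between observed
    and true scores (a hypothetical value in this sketch).
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    estimate = mean + reliability * (observed - mean)
    std_error = sd * math.sqrt(reliability * (1.0 - reliability))
    return estimate, std_error


# Hypothetical examinee: observed score 130 on a scale with mean 100, SD 15.
est, se = kelley_estimate(observed=130, mean=100, sd=15, reliability=0.90)
print(f"estimated true score = {est:.1f}, standard error = {se:.1f}")
```

The lower the assumed correlation, the more the estimate is pulled toward the group mean, which is why widely varying estimates of the phenotype-genotype correlation lead to widely varying inferences.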

There is some evidence that the size of the correlation between 
phenotype and genotype, although the level is uncertain, differs as a 
function of social class (Scarr-Salapatek, 1971). It is not at all
improbable that this correlation varies more from one demographic group to
another than do the correlations between test scores and practical 
criteria. For the present, at least, a test score has more meaning and 
can be used more fairly in making behavioral predictions than in
making inferences concerning theoretical constructs. This may even be 
true for the relatively modest construct of true score. 

Up to this point I have avoided the concept of the fairness of the use 
of tests for selection purposes, concentrating instead on the fairness of 
the inferences about individuals. Does this distinction have meaning? 
From the growing literature in this field, one can conclude that many 
persons would agree that it does. 

Thorndike (1971) has contributed a contrast of two definitions of
fairness, and Darlington (1971) has furnished a broader discussion.
Darlington's exposition has recently been expanded by Cole (1972).
Darlington presents three correlational definitions of fairness in
selection in which value judgments are somewhat hidden, a fourth which
requires an explicit value judgment for each criterion, and a fifth which
is defined without reference to criteria. I shall discuss for the moment
only the first three and shall neglect the superficial fifth entirely.

If the same qualifying score is to be used for members of both
groups, the following relationships are required if the test's use is to be
considered fair. In these three definitions of fairness x represents the
test, y the criterion, and c the demographic group (coded 1 for
majority, 0 for minority).

A) $r_{xc} = r_{yc} / r_{xy}$, derived from $r_{yc \cdot x} = .00$

B) $r_{xc} = r_{yc}$, which equates the probability of meeting minimum
criterion proficiency with the probability of qualifying on the test

C) $r_{xc} = r_{xy} r_{yc}$, derived from $r_{xc \cdot y} = .00$

Definitions A and C above require that the slopes of the regression
lines for test and criterion in the two groups be essentially identical. If
this assumption has to be discarded, different qualifying scores are
necessary under definitions A and C; if the equalities have to be
discarded, different qualifying scores are necessary under all three. When
different qualifying scores are required, they are computed in
accordance with the logic incorporated in the several equalities.
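
The three correlational definitions can be compared numerically with a short sketch. It writes x for the test, y for the criterion, and c for group membership; the function names and the correlations used (.50 between test and criterion, .20 between criterion and group) are hypothetical values chosen only for illustration. For each definition the code computes the test-group correlation that the definition would treat as fair, then checks the partial correlations from which Definitions A and C are derived.

```python
import math


def partial_corr(r_ab, r_ac, r_bc):
    """Correlation between a and b with c held constant."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))


def fair_test_group_corr(r_xy, r_yc):
    """Test-group correlation r_xc implied by each definition of fairness."""
    return {
        "A": r_yc / r_xy,   # criterion unrelated to group once test is held constant
        "B": r_yc,          # test and criterion show the same group difference
        "C": r_xy * r_yc,   # test unrelated to group once criterion is held constant
    }


# Hypothetical correlations, not taken from the paper.
r_xy, r_yc = 0.50, 0.20
implied = fair_test_group_corr(r_xy, r_yc)

# Under A the partial correlation of criterion and group, test held constant, is zero.
r_yc_given_x = partial_corr(r_yc, r_xy, implied["A"])
# Under C the partial correlation of test and group, criterion held constant, is zero.
r_xc_given_y = partial_corr(implied["C"], r_xy, r_yc)

for name, r_xc in implied.items():
    print(f"Definition {name}: r_xc = {r_xc:.2f}")
print(f"A: r_yc.x = {r_yc_given_x:.2f};  C: r_xc.y = {r_xc_given_y:.2f}")
```

With any validity below 1.00 the implied test-group correlation is largest under A and smallest under C, with B in between, which matches the ordering of the definitions by the number of minority group members each would qualify.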

These definitions have very different properties and lead to very
different consequences which both Darlington and Cole have
discussed. I shall describe a few properties, but before proceeding I shall
translate Darlington's definitions into a slightly different form. This
is done in Figure 1.

Perhaps the first property to note is that Definition B represents the
limit of the correction for attenuation of the regressions portrayed in
A and C, but B itself is independent of the correlation between x and
y. Secondly, there is no constant error in the prediction of criterion
score from test in A, there is overprediction for the minority group in
B, and still larger overprediction in C. However, this situation is
exactly reversed for the regression of test score on criterion with no
constant error for C, some for B, and still more for A. Definition C
will qualify most minority group members, A least, and B is
intermediate, but if level of performance on the criterion is important, or
if elimination rate is a problem, A is preferred over B and the latter
over C.



[Figure 1 legend: filled circle = mean of majority group; open circle = mean of minority group]




Fig. 1. Regression illustrations of three definitions of fairness for the use of a single
qualifying score in selection for majority and minority groups.

One dimension of difference among these definitions not discussed
by either Darlington or Cole is the extent to which selection fairness
as here defined can be sabotaged by either the selecting institution or
the examinees. A lesser slope of the regression lines for the criterion
and test for the minority group will qualify fewer minority members
under A and more under C. An institution might be motivated, and
would be able, to reduce validity under A, while the examinees might
be motivated, and would be able, to reduce validity under C.
Definition B puts pressure on both groups to maximize validity. However,
policing of institutions to prevent this type of unfairness is much easier
than policing examinees so that the disadvantage of A is minimized.

Which of these different definitions of fairness is fairest? Cole
presents C in an intuitively appealing manner, but intuitive appeal is a
function of manner of presentation. When shown in regression form,
the lack of bias in predicting test score from criterion measure, which
is characteristic of C, clearly lacks appeal. For what purpose is
prediction of test score desirable? Definition B has more properties that
appeal to me than does C, although it suffers from an ailment common
to all three: namely, the effects of measurement error. Measurement
error in the predictor is critical for A, in the criterion measure for C,
and a differential amount in test and criterion for B. Criterion
measures, it should be noted, are typically less reliable than predictors.
Definition A is, of course, the model for fairness for an individual, and
the absence of sizeable constant errors in its use makes it an attractive
candidate for the preferred definition of fairness in selection as well.

For the present, however, I am most favorably inclined toward
Darlington's fourth definition which involves making a value
judgment for each criterion concerning the size of a bonus on the criterion
for a particular minority group that is warranted by social and
political considerations. Errors in prediction of both types are weighed
against each other for each demographic group in determining the
size of the bonus, and the logic of Definition A is used in selection but
with the regression between predictor and modified criterion replacing
the one ordinarily used.

Though I support Darlington's definition in which explicit value 
judgments are required, I do so reluctantly and with the hope that the 
need for a definition of this sort will pass. The present acute social 
problem which is the basis for the need was brought on by our failure
to treat and to evaluate each human being as an individual. It is my
hope that we can shortly return to that ideal and make it work.
Individual differences are very real and very important. In a recent
personal letter to me, Professor E. R. Hilgard summarized this matter
about as follows: if individual differences did not exist, we would have
to assign them.

It is possible, even probable, that the problem of constant errors as 
a function of group membership under Definition A would virtually 
disappear if we were only better psychologists. Linear composites of 
measures covering a wider gamut of important human characteristics 
than are now being measured could reduce present constant errors in 
prediction substantially. In contrast, increasing the reliability of
present predictors would have very small effects. I am willing on the basis
of a little data and more theory to substitute "would" for "could" in
the statement concerning the effects of linear composites and thus
convert a true statistical statement into a psychological hypothesis. This
hypothesis says that if we were able to predict more accurately for 
individuals in the majority group, we would need to be concerned less 
with the constant errors associated with group membership when pre- 
dicting across groups. This also means that a good deal of research in 
recent years has been misguided. In place of looking for uniquely 
black or white, male or female, or lower-class, middle-class predictors, 
we should have been looking for better human predictors. 

There is still another possible definition of fairness in selection that, 
to my knowledge, has not yet been suggested. This is to transform the 
selection problem into the classification problem and use the multiple 
discriminant function in place of multiple regression. The late Philip 
Rulon convinced me many years ago (see Humphreys, 1952) that
discriminant thinking had a great deal of merit. Neither he nor I
convinced a substantial number of others, however, and the use of the
technique has not grown as I hoped and expected that it would.

Note that Definitions B and C partake of the logic of the discrimi- 
nant in that both assume that meeting minimum standards on the cri- 
terion is sufficient: that is, an examinee either belongs in the criterion 
group or he does not. It might also be noted that college admissions 
officers utilize similar thinking, particularly with respect to the non- 
cognitive traits of entering students. They believe that a particular 
type of student will be best satisfied and will fit best in their particular
institution. In many, many situations we could select in both industry
and education on the basis of those traits that are found in currently
successful, satisfied workers and students. Instead, each institution
now tries to maximize the aptitude level of its selectees. More does not
always mean better. For those who would like to look further at the
mathematics of the discriminant problem, Tatsuoka's new book
(1971) is recommended. Incidentally, the book is dedicated to Philip
J. Rulon.

In closing it is legitimate to inquire concerning the reasons for the
attacks on psychological and educational tests. In part, tests are all too
frequently given under poorly controlled and unstandardized
conditions. In part, also, the attacks are the result of the unfair inferences
concerning native capacities that have been described. In part,
however, the attacks are misdirected. A good test carefully administered
furnishes good information which can be interpreted without regard to
the demographic membership of the examinee. Attacks that assume
the reverse of the preceding statement not only cannot be documented
by good research, but what is worse they misdirect attention from the
primary problem; namely, a large number of children are not learning
to read or do arithmetic or know science and social studies very well.
Furthermore, while membership in any demographic group does not
constitute insurance against inadequate learning, a large proportion
of these children are found among a small number of ethnic groups.
Thus, a problem which is basically an individual differences problem
psychologically and biologically becomes a problem in group
differences with important social and political consequences.

Slow learning cannot be overcome by abolishing the devices that
reveal it most clearly. Any experienced teacher can go into inner city
classrooms and reach a conclusion similar to the one reached by
inspection of test results. Ratings differ from tests in being a little less
reliable, a little less valid, and a great deal more subject to constant
error, but basically they tell the same story. I am forced to conclude
that moves to abolish tests are more ostrich-like than human-like. The
problem will simply not go away.

Although the primary problem is slow learning in the schools, and 
elsewhere, the primary causes for this do not appear to reside in the 
schools. This is clearly true for black children on whom we have the
most and best data. Although these children are much below average
in achievement at the end of the 12th grade, they also show an
approximately equal relative deficit in the first grade. Thus the schools do not
produce the deficit, but neither do they compensate for it. 

If the schools are not responsible for the initial deficit, where do the
causes lie? There are numerous possible sources, but no one source
can be tagged with confidence as making a given precise contribution
to variance. There is good reason to believe, however, that the
contribution of each is nonzero. The list starts with genetic differences
and includes the totality of the environment from the development
and release of the ova and sperm through fertilization, the prenatal,
perinatal, and postnatal environment, to the entrance of the child in
school. Environmental effects that are biological may be as
irreversible as the genetic constitution, and the effects of many early learnings
are highly resistant to change.

When looked at in this way, a solution that simply envisions new
buildings, more books, smaller classes, more advanced degrees for
teachers, that is, more money for education, is inadequate. Just doing
more of what has been done in the past is very unlikely to help
appreciably. Within the limits of formal education the only hopes are
radically new curricula and radical changes in instructional
techniques. While there can be no guarantee that such changes will solve
the problem, these are areas of much higher priority for research and
development than are new intelligence and achievement tests.

Relevance of curriculum content to life experiences seems highly
desirable as is the gearing of this content to where children are in their
intellectual development rather than where we think they ought to be
or where we would like them to be. For example, standard English
should be the terminal goal for the public schools, but to reach this
goal it is probably necessary to compromise in the early grades. With
respect to instructional techniques that differ radically from present
methods, my ordering of the degree of promise places self-paced,
including computer-based, techniques teamed with adequate
reinforcement provisions in first position. A change from conventional schools
to the so-called free schools, however, would further depress already
depressed academic performance.

I shall not try to suggest changes for the preschool conditions
previously summarized. These problems, however, are the responsibility
of both the larger society and of the communities and families in which
children are conceived, develop, and learn.



Questions and Answers 

Q: What do you consider a compromise to demanding English in 
the primary schools? 

A: The use of Spanish by the teacher in the primary grades; the use
of readers in ghetto English, but with the goal constantly in mind that
we have standard English learned and used by the end of the public
school period.

Q: What are the "free" schools that further depress academic
performance?

A: By "free" schools I do not mean the well-organized English day
schools or open classrooms. When well done, such classrooms come
close to meeting my specifications for the preferred direction in
education. I do refer to unorganized laissez-faire schools that have appeared
in this country. I have only a little data specific to these schools, but
they do violate almost everything we know about learning.

Q: A question about my conclusion that "doing more of the same" is
not going to be very helpful.

A: My choice of words was unfortunate, although the context
should have given the clue. "Providing" is more accurate than
"doing." More money for education is not the answer. I recommend the
new book by Jencks and others concerning the influence of the
schools on intellectual performance, occupation, income, and so on.

Q: Is there any way of taking into account the possibility of
systematic bias in the measurement of criterion performance?

A: Criteria are of necessity culturally bound: that is, criterion
performance is something valued by the society. It should be measured
objectively if at all possible. Ratings used as criteria are subject to
bias which is difficult to control.

Q: How did your research lead to your conclusion that greater
accuracy for individuals would lead to smaller constant errors?

A: I have no research, but there are some important general
principles. In the gene pools of blacks and whites there is almost 100 percent
overlap in the genes stemming from a single evolutionary origin
for the races of man. There is also much, much more overlap in the
environment for blacks and whites in this country than there are
differences: schools, diet, TV, radio, language, newspapers, and the like.
There is very great overlap on all psychological traits.

Q: Black-white differences have commonly been found to be
minimal in grade one. How can you conclude that school differences are
small?

A: The premise is not true. The mean difference between blacks
and whites in this country in standard score units is about the same
in the first grade as in the 12th. Mental age or grade equivalent units
become smaller with advancing age and movement through the grades.
A one-unit deficit at age six is the equivalent of a two-unit deficit at 12,
a three-unit deficit at 18.
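
One way to read the arithmetic in this answer is sketched below. It assumes, purely for illustration, that attainment stays at a fixed fraction of age level, so that the relative deficit is constant while the deficit counted in age or grade-equivalent units grows with age. The function name and the five-sixths ratio are hypothetical and are chosen only so that the deficit at age six equals one unit.

```python
from fractions import Fraction


def deficit_in_age_units(age, ratio):
    """Deficit in age (or grade-equivalent) units for a fixed attainment ratio."""
    return age * (1 - ratio)


# Hypothetical ratio: attainment at five-sixths of age level at every age.
ratio = Fraction(5, 6)
deficits = {age: deficit_in_age_units(age, ratio) for age in (6, 12, 18)}

for age, deficit in deficits.items():
    print(f"age {age}: deficit of {deficit} unit(s), relative deficit {deficit / age}")
```

Under this assumption the deficit is one, two, and three units at ages 6, 12, and 18, while the relative deficit is one-sixth throughout.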

Q: Are speeded tests "fair"?

A: It depends on whether speed increases or decreases validity.

Q: What are the prospects for the new cortical response measures?

A: Nil for replicating the information we now obtain from an 
intelligence test; but something useful may develop. 

Q: Are there studies of interrelations of abilities as a function of
demographic data?

A: Yes. Such studies date back to the early fifties in the Air Force.
There are minimal differences in the organization of abilities as
between males and females, blacks and whites, or low and high SES
groups.



Q: How do you exclude broader life experiences and focus only on
commonality in curriculum experiences as important influences on the
organism's development?

A: I do not, but one gives an achievement test in order to find out 
how well a student reads or does arithmetic. There are many possible 
explanations for a deficit, but explaining the deficit does not erase it.

Q: Would you favor adjusting aptitude test scores on the basis of
the mid-parents' educational level?

A: Not at all. I favor maximizing the accuracy of predictions or
inferences. As a general procedure this adjustment would decrease the
accuracy of predictions. For certain predictive situations, this variable
is useful, but it should be studied independently.

Q: Why do you use the term "slow learning"?

A: In part, I use it as synonymous with below average academic
performance, but it also suggests a useful concept. If we consistently
allowed time to vary and made certain that students acquired the
skills and knowledge required before moving on to new material, some
below average performers might achieve higher levels of competence
than they now do.

Q: Is it true that intelligence tests overpredict performance for
blacks, because performance is also predicted by achievement, and
achievement is reduced relatively more by an underprivileged
background than is intelligence?

A: There is no qualitative difference between achievement and
intelligence tests. My hypothesis is that overprediction is the result of
(a) errors of measurement and (b) failure to measure important
functions.

A Theoretical Approach to Cultural and Biological Differences



CHARLES V. WILLIE

Syracuse University



I have entitled this presentation "A Theoretical Approach to Cultural
and Biological Differences" because I intend to draw my inferences
more from the similar findings of the scholars rather than setting forth
some findings of my own. Lest the audience become concerned,
however, that I am some kind of mutational freak, I will not behave
completely out of character as a discussant. If the spirit should so move
me, I will get in a few of my own substantive licks here and there and
pretend that they are elaborations on a theme initiated by one of the
authors.

Essentially, J. McV. Hunt, Lloyd Humphreys, Eleanor Maccoby,
and Carol Jacklin have focused upon assumed biological differences
or differences thought to be contributed to by heritability more so
than by environmental circumstances (Jensen, pp. 1-123). They
have analyzed variations in such phenomena as intelligence, aptitude,
achievement, verbal, mathematical, and spatial ability by sex, race,
ethnic, and social-class categories. Hunt states that "heredity is clearly
primary" with reference to intelligence. But Humphreys warns us that
"the functions measured by intelligence and aptitude tests are not
stable in the individual over time." And Maccoby and Jacklin
concluded "it is still a reliable generalization that there are no sex
differences on these [intelligence] tests." From these statements we may
determine that populations with obviously different hereditary
characteristics such as those associated with sex respond similarly to
intelligence tests, and that a person whose hereditary characteristics are
obviously the same at birth as in later years responds differently to
intelligence tests at different periods in time. If, under these conditions,
heredity is clearly primary, the question must be raised: primary
for what?

Ours is a discussion about population genetics. I thought it
appropriate to consult an outstanding scientist in this field, Theodosius
Dobzhansky. The second edition of his excellent book, Genetics and
the Origin of Species, was awarded a prize by the National Academy
of Sciences (Dobzhansky, 1951). From Professor Dobzhansky's
discussion, I have extracted eleven principles which make specific
contribution toward a theory of cultural and biological differences:

1. A Mendelian population is . . . a reproductive community of
individuals who share in a common gene pool [p. 15].

2. Gene frequencies and variances, rather than averages, characterize
Mendelian populations. All Mendelian populations are polymorphic
[pp. 108-109].

3. A species [is] polymorphic if it contains a variety of genotypes, each
of which is superior in adaptive value to the others in some habitats
which occur regularly in the territory occupied by the species . . .
[pp. 132-133].

4. Polymorphic populations [are], in general, more efficient in the
exploitation of ecological opportunities of an environment than
genetically uniform ones . . . [pp. 132-133].

5. Racial differences are more commonly due to variations in the
relative frequencies of genes in different parts of the species population
than to an absolute lack of genes in certain groups . . . [p. 176].

6. Race and species are populations . . . which remain distinct only so
long as some cause limits their interbreeding [p. 18].

7. The sum of genes of an individual or a population constitute the
genotype. . . . The resulting bodily forms . . . are different
phenotypes. . . . A genotype is potentially able to engender a multitude of
phenotypes . . . [pp. 20-21].

8. Any phenotype that may be formed is necessarily a response of the
environment to the activity of the genotype. The genotype reproduces
itself regardless of what phenotype it happens to evoke in a given
instance [pp. 20-22].
9. Some genotypes permit a greater amplitude of modifications . . . than
others, and some traits are plastic while others . . . are more rigidly fixed
[p. 23].

10. Human intellectuality and emotional development is an example of a
great plasticity and susceptibility to environmental influences.
Environment, upbringing, schooling, association with other people and
the manifold variations of individual biographies are powerful
moulders of human personality. The genotypic determinants of
human personality are easily obscured by the environmental ones
[p. 23].

11. The populations of most species vary, often within an enormous
range, from generation to generation [p. 163].

It is appropriate to close this review of findings pertaining to
population genetics by returning to Dobzhansky's discussion of the
Mendelian laws. According to Mendel, the fundamental units of racial
variability are populations and genes, not complexes of characters.
Dobzhansky goes on to say that:

many studies of hybridization were made before Mendel, but they did
not lead to the discovery of Mendel's laws. In retrospect [reports
Dobzhansky] we see clearly where the mistake lay: they treated as units the
complexes of characteristics of individuals, races, and species and
attempted to find rules governing inheritance of such complexes.
[Dobzhansky states that] Mendel was the first to understand that . . . the
inheritance of separate traits [and] not [the inheritance] of complexes of traits
. . . had to be studied. [Dobzhansky bemoaned the fact that] some of the
students of racial variability consistently repeat the mistakes of Mendel's
predecessors [p. 177]. [That is, they try to trace inheritance through
complexes of characteristics.]

Dobzhansky concludes that "Race is not a static entity but a
process. . . . Racial variability must be described in terms of the
frequencies of individual genes . . . in groups of individuals occupying
definite habitats. Such a description is more adequate than the usual
method of finding the abstract average phenotypes of 'races' . . . [pp.
177-178]."

Please note that this review has focused on population genetics and
not the genetics of individuals. As stated by Dobzhansky, the rules
governing the genetic structure of individuals are different from those
governing the genetic structure of a population. Moreover, he states
that "every human individual is unique, different from all others who
live or lived [pp. 15, 4]."

Returning to the question posed earlier, I agree with Hunt that
heredity is primary. It is primary for the continuation of the species;
for only a genotype may reproduce a genotype. (As Hunt has said,
only an elephant can reproduce an elephant.) After that, much is left
to the habitat and environment. On this point agree all of the authors
as well as the population geneticists. For example, Hunt states that
life under different circumstances can produce differences in a given
genotype or a given population of genotypes. He makes specific
reference to children from poor families who have lived under conditions
of only poverty. Humphreys states that children who are much below
average in achievement may suffer the deficit from a number of
possible sources, starting with genetic differences and including the
totality of the environment. Maccoby and Jacklin state that data on the
incidence of specific deficits in learning abilities indicate that these
occur considerably more frequently among boys than girls. However,
they explain that the greater vulnerability of the male child to
anomalies of prenatal development, birth injury, and childhood disease is
well known and that this vulnerability probably does affect the
incidence of very low scores on tests of intellectual abilities. These
statements are similar to those of Dobzhansky, that a phenotype is a
response of the environment to the activity of the genotype and that the
genotypic determinants of human personality are easily obscured by
the environmental ones. Thus, the findings of Hunt, Humphreys, and
Maccoby and Jacklin are in accord with a fine tradition of behavioral
science theory which, in summary, states that a genotype is potentially
able to engender a multitude of phenotypes. Apparently most of our
tests have been measuring genotypic responses to the environment, or
phenotypes, which accounts for the instability of such measurements
on the same individual at different periods in time.

Dobzhansky's finding that "human intellectuality ... is an example
of a great plasticity and susceptibility to environmental influences" is
confirmed by the investigations of our authors and those of other
scientists on whom they report. Hunt has shared with us the results
of studies conducted by him and associates in the Parent and Child
Center in Mt. Carmel, Illinois. He and his colleagues found that
child-rearing of parents from a lower class has been improved by a
parent education program so that the behavior of their children in the
development of object permanence surpasses that of the middle-class,
at least during the first two years of infancy. These and other findings
are illustrative of the great plasticity of intellectuality. Humphreys
states that there is some evidence that the size of the correlation
between phenotype and genotype differs as a function of social class.
This evidence may or may not indicate genotypic plasticity; but it
certainly does point toward phenotypic variability, possibly related to
differential environmental circumstances of life.

Maccoby and Jacklin inform us that variations in phenotypes,
particularly with reference to spatial and verbal ability, may occur at
different age levels because of different contemporary and past
experiences. While they hold strongly to the general statement that there
appears to be no basis for saying there is a sex difference in overall
intellectual power, mention is made of variations in the stages in
which these specific skills develop. For example, an increasing
superiority of girls on verbal tasks and an increasing superiority of boys
in spatial tasks and possibly some quantitative tasks have been
observed. Thus far, Maccoby and Jacklin have not located any solid
research which might explain social factors associated with these
different patterns of development. However, it is important to point out
the need for longitudinal studies on patterns of sequential
development. While the females may have a head start in verbal development
during adolescence, presumably the males play catch-up and do make
creative contributions in arts and letters in adulthood. Indeed, it is my
guess that the honorary degree, Doctor of Humane Letters, is given
more frequently to men than to women during the May-June college
commencement pageant. If this be so, again it is probably due not to
any inborn differences in intelligence between the sexes but to the
response of male genotypes to a sexist environment. Be that as it may,
the fact of the possibility of delayed development is worthy of
mention for behavioral scientists who have forgotten that adaptation to the
environment is one of the chief means of survival.

At this point may I introduce a few findings from one of my studies
of black students at predominantly white colleges. These findings have
to do with their academic adaptation. As you might guess, most black
students enrolled in the four upstate New York colleges which I
studied in 1969-1970 did more poorly than most white students. In terms of
self-reported cumulative average grades, about 23 percent of the
blacks compared to 40 percent of the whites had As or Bs; 64 percent
of the blacks compared to 40 percent of the whites had cumulative
grade averages at the C level, and 13 percent of the blacks compared
to 2 percent of the whites had self-reported cumulative grades of D or
less. Averages like these tend to mask so much and contribute to our
misunderstanding about variability in adaptation. When analyzed by
year in school, my colleague and I discovered that black college
seniors had better grades than black college freshmen. But not only
that; black college seniors had better grades than white college
seniors: 52 percent of the black college seniors compared to 42
percent of the white college seniors had cumulative grades at the A and B
levels (Willie and Sakuma, 1972, p. 86).

Do these findings mean that black college seniors come from intel- 
lectually more gifted parents than black college freshmen? I doubt it. 
Do these findings mean that black college seniors come from genetic 
pools that are intellectually superior to the genetic pools from which 
white college seniors are drawn? I doubt this possibility too. My con- 
clusion would be that the superior achievement of these black seniors 
compared with other black and white students is a function of the 
way in which they adapted to the difficult situation in which they 
found themselves. Through endurance, senior black students eventu- 
ally transcended and overcame many academic obstacles. The superior 
outcome of their endurance did not become visible until the fourth 
year. During the first three college years, the average black student 
trailed behind the average white student in academic achievement as
measured by grades. Many black students were unable to persevere 
until the fourth year. Still the number of black students in the fresh- 
man year in these upstate New York colleges is twice as large as the 
number in the senior year, while the number of whites in each of the 
four classes is more evenly divided. It is probably fair to say that both
the low achievement of black students during the first three years and 
high achievement of blacks during the fourth and last year were due 
to environmental circumstances and adaptations thereto. Also, as
Humphreys has pointed out, motivational components come into 
play. 

On the basis of my study (which unfortunately is inconclusive
because it is limited to cross-sectional data), I would assert that black
students in predominantly white colleges are neither superior nor
inferior to their white college mates, that each has an overlapping
range of intellectual capacity which is capable of making a variety of
responses to different environmental situations, and that black seniors
tend to respond by superior academic performance while black
freshmen, sophomores, and juniors respond by inferior academic
performance compared to whites. We can understand this only if we can
remember four principles earlier set forth by Dobzhansky: (1) that
"a genotype is potentially able to engender a multitude of
phenotypes," including those which function in superior and inferior ways,
(2) that "genotypic determinants of human personality are easily
obscured by the environmental ones," (3) that "racial differences are
more commonly due to variations in the relative frequencies of genes
in different parts of the species population than to an absolute lack of
genes in certain groups," and (4) that "environment, upbringing,
schooling, association with other people and the manifold variations
of individual biographies are powerful moulders of human
personality."

Hunt and Humphreys have cautioned against treating measures of 
phenotypes (that is, measures of developmental achievement) as if
they were genotypic. But I am a mite unhappy at the timid way in 
which all three papers dealt with our common tendency to rely upon 
statistical measures of aggregated characteristics to get at the problem 
of biological difference (if any) between the races, particularly as it 
relates to intelligence. Even though, as a racial group, their cumulative 
average grade attainment was lower, black senior college students in
my study performed better than whites academically. They could not 
have performed better than whites academically the last year of col- 
lege if they had not had the capacity to do so, a capacity which was 
probably present during the earlier three years. 

It would appear that I am making a tautological statement that
black students outperformed whites because black students had the
capacity to perform as well as or better than whites. This seemingly
tautological statement, however, is of value because we know that
some black populations perform less well compared to whites.
Nevertheless, we should be reminded that even in such populations of poor
performers the capacity to perform as well probably is present and
that it could become manifest, given the appropriate set of
circumstances or motivational components. One reason for not recognizing
this fact is our tendency to rely either on composite measures of
intelligence or on composite descriptions of a population. And thus we
commit the same error committed by Mendel's predecessors, that of
treating as units the complexes of characteristics of races or
intelligence rather than recognizing as did Mendel that what must be
understood is the inheritance of separate traits and not the inheritance of
complexes of traits. Moreover, it should be stated again and again
that genotypic traits may be present even though they are not
observable in the phenotype. This simple fact is frequently forgotten. In
a gentle way, I am trying to say that although our statistical methods
and techniques for studying variations (if any) in the association
between intelligence and race appear to be sophisticated, conceptually
(especially with reference to the nature and form of heritability) some
are pre-Mendelian and, therefore, dated.

Mention of the need to disaggregate traits for the purpose of
studying heritability leads directly into a discussion of race. It is time we
ceased the silly business of discussing variations in behavior by racial
categories as if the races of humankind which we commonly recognize
were pure. Indeed, racial purity would be a liability. Such a
population would be less adaptable and less able to exploit its environment.
Populations of Negroid and Caucasoid people in the United States,
for example, are imaginary and at best abstract statistical constructs.
Dobzhansky tells us that the genetic structures of populations can be
molded into new shapes through the influences of selection,
migration, and geographic isolation, and especially the breeding of species
(Dobzhansky, 1951, p. 15). To be sure, there have been laws against
the intermingling of the races in this country. But historian John Hope
Franklin tells us, "the slave woman was frequently forced into
cohabitation and pregnancy by . . . her master." He describes the
miscegenation which went on during the slave period as "extensive
(Franklin, 1967, p. 204)." Moreover, he indicates that there are records
of marriages of Negro-White couples and Negro-Indian couples in
New England during the colonial period (Franklin, 1967, p. 109). We
know that there was considerable interbreeding between whites and
the Native Americans, also known as American Indians, when the
West was settled (Brown, 1970). In summary, there has been a lot of
race mixing and interbreeding in the United States. The diversity of
inherited characteristics exhibited by the people in any public
gathering is ample evidence of the extensive cohabitation between all sorts
and conditions of people over the years in this land. Lloyd Humphreys
is right when he states that any given individual belongs to a very
large number of different demographic groups. It is inappropriate to
measure intelligence as a complex of characteristics, if one wants to
understand something about inheritance. Maccoby's and Jacklin's
approach of looking at specific skills as well as the call by Humphreys
for better human predictors are more promising than the search for an
inheritable composite. Also it is inappropriate to relate a faulty
measure of intellectual heritability which Humphreys calls a "hypothetical
capacity" to race, which Dobzhansky calls an abstract statistical
phenotype, if one wishes to understand the association between innate
characteristics. Neither intelligence as presently measured nor race as
presently defined is innate. Yet we persist in correlating the two and
thereby compound our error by making what Humphreys would
classify as "unfair inferences about native capacities." The discussion
about race and intelligence in the United States, then, is so much talk
about nothing. Measures of intelligence are unsatisfactory and so are
the definitions of race. So what is all the fuss about?

I am inclined to believe that the controversy has little, if anything,
to do with science. It seems to me that the controversy is a
continuation of the social Darwinism in American thought so excellently
documented by Richard Hofstadter (Hofstadter, 1955). His chapter on
"Racism and Imperialism" details how Americans rationalized
oppression of outgroups in the past as a natural development in which
"backward races would disappear before the advance of higher
civilizations [p. 171]."

Herbert Spencer, William Graham Sumner, and nineteenth century
white Americans may have believed these thoughts. They used the
findings of population genetics as a way of putting people down as
inferiors and explaining outgroup failures. But twentieth century
Americans have been exposed to more enlightening thoughts. They
know "that the physical well-being of [human kind] is a result of their
social organization and not vice versa." They know that "social
improvement is a product of advances in technology and social
organization, not of breeding or selective elimination (Hofstadter,
1955, p. 204)."

If twentieth-century Americans know these things, how do they
continue to use social Darwinism? It seems to me, in the light of the
discussion above and some unpublished data which I have on a school
desegregation study, that social Darwinism is used today not so much
to put down the outgroup as subhuman as to build up the ingroup as
superhuman. Also social Darwinism now is used to explain the lack
of success of the ingroup rather than the failure of the outgroup.

In a few predominantly white elementary schools in upstate New
York which had recently received a modest number of 50 to 60 black
children, white teachers were asked to rate the level of social
adjustment for each new child in their class. New children were inner-city
blacks transferred to schools in middle-class communities to improve
their racial balance and affluent whites who were new residents in the
neighborhoods surrounding these schools. Social adjustment was
rated on a multi-interval scale ranging from well-adjusted, fairly
well-adjusted, moderately adjusted, to poorly adjusted. Children also were
requested to rate their degree of adjustment on a multi-interval scale.
With reference to the well-adjusted, white teacher ratings of black
children tended to correspond closely with the ratings which black
children gave themselves in this area. For white children, however,
the proportion whom white teachers considered to be well-adjusted
substantially exceeded the proportion of the children who felt that
way. The white teachers tended to see well-adjusted black children as
they were; but they perceived the adjustment experience of several
white children to be better than the white children believed it was.

While the literature continues to grow with accusations that white
teachers are unmindful of the capacities of black children, I am
inclined to believe that twentieth-century whites who have lived through
the civil rights revolution of the 1950s and the 1960s in this country
are not so unmindful today. They have no need to put down blacks or
members of the outgroup. But because of the remnant of social
Darwinism in American thought, there is now a tendency to inflate
the capacity of the ingroup.

The article by Arthur Jensen on IQ and scholastic achievement is a
classic example of the use of social Darwinism to explain away the
lack of success of the ingroup (Jensen, 1969, pp. 1-123). One does not
have to read far into that article to pick up a twentieth-century tone of
Manifest Destiny. Read this: "The remedy deemed logical for children
who would do poorly in school is to boost their IQs up to where they
can perform like the majority." This is a direct quote from Arthur
Jensen. He goes on to say: ". . . this is in fact essentially what we are
attempting in our special programs of preschool enrichment and
compensatory education [p. 3]." He develops a series of questions: "Why
has there been such uniform failure of compensatory programs
whenever they have been tried? What has gone wrong? In other fields,
when bridges do not stand, when aircraft do not fly, when machines
do not work, when treatments do not cure, despite all conscientious
effort on the part of many persons to make them do so, one begins to
question the basic assumptions, principles, theories and hypotheses
that guide one's effort. Is it time to follow suit in education [p. 3;
italics added]?" He asserts that the success of preschool and
compensatory programs is to develop gains in IQ and in scholastic
achievement, and then, in a bold attempt to stake out the public definition of
the problem, he states that "our diagnosis should begin . . . with the
concept of the IQ."

First question. By what line of reasoning did Jensen determine that
the intellectual adaptation of the majority should be the determinant
norm for the kinds of adaptations which the minority ought to make?
Remember my upstate New York study of black college students.
Minority black seniors functioned better than majority white seniors.
It is an arrogant act to establish the ingroup as a model for all others
to follow. Jensen's position with reference to black youth was quite
similar to one advocated by Daniel Patrick Moynihan for black
adults. Moynihan advocated overhauling the black family because he
declared it "out of line with the rest of the American society"
(Moynihan, 1966).

Again the majority is used as a model for the minority. When this
is done, the majority may expect the minority to ignore it, for the
minority has found survival value in the kinds of adaptations it has
grown accustomed to. Moreover, minority adaptations are sometimes
more beneficial for all.

An honest assessment by majority and minority members of this
society would conclude that "all conscientious effort" has not been
expended to provide compensatory educational programs. A con- 
scientious effort would require that a disproportionate amount of 
community educational resources be devoted to educating the dis- 
advantaged. This never has been done in this nation. 

Finally, the poor and disadvantaged have some ideas of their own
about what they would like to get out of formal education.
Manipulation of their children's IQ may not be their highest priority. Learning
how to endure (Douglass, 1962, p. 39), and how to develop a positive
concept of the self (Rosenberg and Simmons, 1972, pp. 21-30), and
how to gain a measure of control over one's environment (Teele, 1970,
p. 367) probably are as important to the poor as gains in IQ.

And so the great experiment failed which was fashioned by the
ingroup for the outgroup. The affluent, majority, ingroup tried to
make over the poor, minority, outgroup in its own image. Rather than
accept the failure as an inadequately planned program, improperly
imposed upon a population, the ingroup with Arthur Jensen as
philosopher-king has turned once more to social Darwinism, this
time to explain the lack of success of the ingroup. Higher Horizons,
Head Start: they were misguided efforts to remedy the irremediable
and not failures in social organization, according to twentieth-century
social Darwinism.

The time has come to deal with cultural and biological differences
as they ought to be dealt with, in the tradition of the social and
behavioral sciences, and not in the tradition of social Darwinism.

The kinds of questions addressed to me are sociologically quite
significant. There were six questions, and five dealt with one issue:
explaining away why black college seniors had better grades
than white seniors. Let me share those questions with you.



Q: Could you explain the high achievement levels of seniors as
compared to freshmen on the basis, in part, of differences in the
variations in the variances of the subgroups; that is, those who failed
have already dropped out? Also, in comparing black with white
seniors, one might be able to offer an explanation based on the view
that those blacks who stayed were more highly motivated to stay than
white students who stayed. The higher-scoring blacks were motivated
by self and therefore this group of blacks could be better compared
to the upper portion of whites.

Q: What controls are used in your study for differences in grades
for black and nonblack students resulting from differences in courses?
Were, for example, student-teaching grades considered equal to
grades in physics, math, and chemistry courses?

Q: What percent of the entering black student group persevered to 
the senior year compared to the white student group? Did a larger 
rate of attrition among the former leave a smaller but better adapted 
group than the latter group represented? 

Q: Are the blacks at Syracuse of middle or lower socioeconomic 
status? Perhaps there is an operative embedded.

Q: Aren't your upstate New York studies vulnerable on two
points: differences in grading methods and selective factors? 

A: Obviously, I cannot answer all of these questions in the limited
time available. As a social scientist, I pointed out that my studies were
inconclusive because they were based on cross-sectional data. But I
would say this: I believe my findings.

My major point, however, was not about the population of black
students who were seniors. My major point had to do with individual
black students who were seniors. The fact that they performed well as
seniors meant that they had the capacity to perform well when they
were freshmen, even though their freshman grades may or may not
have reflected that capacity.

After accepting Bill Turnbull's pleasant but overwhelming invitation
to deliver this luncheon address, I felt the need somehow to pull
myself together, so I tried to catch my breath by doing a bit of casual
research on the luncheon addresses of the past. One finding that rather
shook me has to do with the first time a luncheon address got itself
insinuated into the conference program. It was the year 1951. Imagine
my surprise when I discovered that the conference chairman that year
was none other than one Henry Dyer.

Another finding that our good friend Anna Dragositz dug up for me 
has to do with the ages of the luncheon speakers. She found that their 
ages at the time of speaking have ranged from 37 to 73 years with the 
median at 57 years. This puts me comfortably above the norm at the 
79th percentile. Having no criterion reference for that score, however,
I'm not sure whether it's good or bad.

A third finding was that out of the 20 addresses thus far delivered,
only six have had much of a connection with testing problems per se.
Therefore, all things considered, and because I think testing problems
are as important as any problems in the world, I thought it would not
be out of place to express a few thoughts about some of the old
problems in testing as they appear in 1972.

Hence, the ambiguous title of this speech: "Recycling the Problems
in Testing." The ambiguity was put there on purpose to give me some
room for maneuver on reaching this moment of truth. Since then, I
have narrowed down the subject by lengthening the title. It should now
read: "Recycling the Problems of Educational Testing with Special
Reference to the Problematical Uses of Achievement Tests as
Measures of Instructional Effectiveness in Large-Scale Testing Programs
Involving Diverse Groups of People." (I shall be curious to see what
the editor of the conference proceedings does with that mouthful.)



In the introductory chapter with which Anne Anastasi starts off her 
admirable selection of Invitational Conference papers (Anastasi, 
1966), she gives a good deal of attention to a testing problem that was 
very much on the minds of the tiny band of testing leaders who came 
to the first three or four conferences back in the 1930s, presumably to 
lick their wounds. They were deeply worried about the many and 
ingenious ways in which the people in the schools were mishandling 
tests and misusing test scores, especially the tests and test scores then
being generated by state testing programs. 

These days state testing programs are being transformed into state
assessment programs with a concomitant shift in emphasis from the 
guidance of students to the evaluation of schools and their educational 
programs (Dragositz, 1971). The old problems in testing, however, 
have survived the transformation. They are still around; only they
have taken on some new and rather more scary dimensions. The rea- 
son should be fairly obvious to anyone who has been keeping up with 
the educational assessment and accountability acts that state
legislatures, following the lead of the Congress since 1965, have been grind-
ing out over the past few years. These acts reflect an explosion of 
public interest in the workings of the schools that may be historically 
unique. As a consequence, the field of education has become strewn 
with politics, and educational testing has become an instrument, if not 
a weapon, in the political process (Kirst & Mosher, 1969; Cohen, 
1970). And this means that our worries today about the mishandling 
of tests and the misuse of test scores must embrace not only school 
personnel, but also politicians and the diverse and pluralistic constit- 
uencies they serve. 

Accordingly, to do obeisance to the title of this talk by stretching a
metaphor practically out of shape, one might say that the problems
in testing that were afflicting the schools in the 1930s have turned out
to be nonbiodegradable and are therefore in need of a recycling job to
prevent them from befouling the streams of education in the 1970s.
Please, I did not say that educational tests themselves are
inherent pollutants of the educational process, even though some of
the critics of testing seem so to contend. To the contrary,
most of us here, I suspect (and this emphatically includes me), would
earnestly subscribe to the proposition that testing, when rightly
conceived and properly handled, is absolutely indispensable to the
management and improvement of instruction all up and down the line,
from the classroom, to the superintendent's office, to the school board,
and even to the halls of the Congress. The trouble is that, as testing
has become so much a part of our socio-educational culture, more
people than ever before are unaware of the problems with which
conscientious testers themselves have been perennially concerned,
problems that must always be taken into account if testing in the
schools is indeed to be rightly conceived and properly handled.

One disturbing manifestation of the difficulty is the tendency among 
the uninitiated to expect from tests harder information than the scores 
can reasonably be expected to supply. For example, a few years ago, 
when the contagion of performance contracting was in its incipient 
stage, I found myself in conversation with a government official who 
was one of the more enthusiastic proponents of the performance
contracting idea. I tried to make the point that the gain scores on even the
most reliable tests in reading or math or anything else were insuffi- 
ciently devoid of measurement error to justify the exact payment by 
individual student results that was then being advocated. His reply 
was swift and off the point. It was high time, he said in effect, that test
makers got on the ball and began producing tests that they could 
guarantee to be 100 percent reliable under all conditions.

Well, how does one penetrate the fantasy world exemplified by that 
kind of demand? How does one get across the shocking truth that 100 
percent reliability in a test is a fiction that, in the nature of the case,
is unrealizable? How does one convey the notion that the test relia- 
bility problem is not one of reducing measurement error to absolute 
zero, but of minimizing it as far as practicable and doing one's best to 
estimate whatever amount of error remains, so that one may act more 
cautiously and wisely in a world where all knowledge is approximate
and not even death and taxes are any longer certain?
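
The point about gain scores can be made concrete with a small sketch. It assumes the classical test theory formula for the reliability of a difference score when the pretest and posttest have equal variances. The function name and the illustrative figures (a reliability of 0.90 for each test and a pre/post correlation of 0.80) are assumptions chosen for illustration; none of them comes from the address.

```python
def gain_score_reliability(r_pre: float, r_post: float, r_pre_post: float) -> float:
    """Reliability of a simple gain (posttest minus pretest) score.

    Classical test theory result, assuming the pretest and posttest
    have equal variances:
        r_gain = ((r_pre + r_post) / 2 - r_pre_post) / (1 - r_pre_post)
    """
    if not -1.0 < r_pre_post < 1.0:
        raise ValueError("pre/post correlation must be strictly between -1 and 1")
    return ((r_pre + r_post) / 2 - r_pre_post) / (1 - r_pre_post)


# Hypothetical figures: two quite reliable tests (0.90 each) whose
# scores correlate 0.80 from pretest to posttest.
example = gain_score_reliability(0.90, 0.90, 0.80)
print(round(example, 2))
```

Under these assumed figures the gain score comes out only about half as reliable as a perfectly reliable measure would be, even though each test taken alone is highly reliable. That is the sense in which individual gain scores are too error-laden to support exact payment by results.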

Take another example. Last year the education committee of one of
the state legislatures came up with an educational accountability bill
that read in part as follows:

. . . if the performance of any school district on any test approved by the
state board of education . . . does not equal or exceed the national
performance average for such a test for two successive years, said school
district shall not receive any further state financial assistance . . . until
such time as said school district has achieved such national performance
average (Kansas, 1971).



Somebody (one hopes it was someone in the testing fraternity)
must have been sufficiently persuasive to convince that committee that
the proposed legislation contained some fatal psychometric, not to
mention educational, flaws, for the bill was not enacted into law.
Nevertheless, it typifies some of the confusion in high places regarding
the nature of testing and the proper uses of test scores. This confusion
seems to be rooted in the almost irresistible tendency to hypostatize
such an essentially meaningless abstraction as national performance
average in tested achievement. The confusion is further confounded by
the notion that a test norm can double in brass as a "standard" of
performance to be reached by everybody. And, of course, attendant
on these confusions is the amazing idea that the way to upgrade public
education is to withhold funds from the schools whose students are
not doing well.

In the old educational theology, it was generally supposed that the
way to promote learning in the young was to beat the devil out of the
pupils. The new educational theology seems to hold that the same
state of grace can be more readily achieved by beating the devil out of
the educators. There are some of us, I suspect, who would have doubts
about the efficacy of either method of exorcism.

Now for a third example in a quite different domain. It comes from
an earlier time, but it points to a number of testing problems that are
probably more prevalent today than they were in the past: problems
that ought to make one wonder about what sorts of shenanigans lie
hidden behind any set of test scores on which one may rely for making
educational decisions or erecting theories about the nature of mind.
Forty years ago I knew a wonderful sixth-grade teacher (actually a
senior colleague whose pedagogical skills I much admired), a teacher
whose ideas about education were so old-fashioned that she honestly
believed she could teach her pupils to think. She was quite unaware
of the high-pitched controversies then swirling through the world of
psychology over the question of nature versus nurture or the constancy
versus the inconstancy of the IQ—just as thousands of teachers
today, I remind you, are similarly unaware of the same controversies
in their latest incarnation.

That teacher knew a thing or two about the
arithmetic, if not the theory, of intelligence testing. She knew that if
you could push up the raw scores on an intelligence test while the
pupils' ages remained about the same, then by golly, their IQ's would
show a comforting rise. She would have had a ready answer to the
vexed question: "Can we boost the IQ?" for she was routinely boosting
it annually in her own pupils. Her method was simple. She had got
hold of all four forms of the old Otis Self-Administering Test of
Mental Ability, and, in all good conscience, she used the items of the
test as exercises in a unit on intelligent thinking. This she conducted
strictly in the drill-and-practice mode of instruction—minus, of course,
any aid from a computer, since computers had not yet arrived on the
educational scene. The gains in IQ she produced in her pupils were
breathtaking in their magnitude and beautiful in their upward flight.
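
The arithmetic this teacher was exploiting can be written out. What follows is only a sketch: it assumes the classical ratio definition of the IQ (mental age over chronological age, times 100), which the talk itself does not spell out, and the ages are invented for illustration. Drill on the test items raises the raw score, and with it the mental age read from the norms, while chronological age stays where it was.

```latex
% Assumed ratio definition of IQ (not stated in the talk)
\[
  \mathrm{IQ} = 100 \times \frac{\mathrm{MA}}{\mathrm{CA}}
\]
% Hypothetical pupil with CA = 11.5 years throughout.
% Before drill: raw score corresponds to MA = 11.5
\[
  \mathrm{IQ}_{\text{before}} = 100 \times \frac{11.5}{11.5} = 100
\]
% After drill on the test items: raw score corresponds to MA = 13.8
\[
  \mathrm{IQ}_{\text{after}} = 100 \times \frac{13.8}{11.5} = 120
\]
```

Nothing about the pupil's age has changed between the two lines; the whole of the 20-point gain comes from the numerator.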

Clearly, that sixth-grade teacher was not playing according to the
rules of the testing game. But this was because, like many of her
present-day counterparts, she was simply unaware of the rules. On the
other hand, had she known what the rules were, she would probably
have thought them an irksome constraint on what she regarded as
effective teaching. Given her pedagogical frame of reference, she
would have had a rather good point. For it was a frame of reference
that included the old formula—lately revived in some forms of
programmed instruction: "Test, teach, test, teach, test, and teach to
mastery." And where else than in those old Otis Tests would she have
found such well-worked-out exercises for applying the formula to the
teaching of intelligence? The trouble was she got the teaching mixed
up with the testing. What she did not realize—and what I am afraid
many people still do not realize—is that if you use the test exercises as
an instrument for teaching, you destroy the usefulness of the test as
an instrument for measuring the effects of teaching.

II

This sampling of events from the remote and recent past is meant to
be a reminder of something which is so obvious that, in our
preoccupation with the sticky problems of test theory, we sometimes tend to
forget it: namely, that testing is first and last a series of human
transactions and that the problems of testing—even many of the theoretical
ones—are essentially people problems.

Broadly considered, there are four groups of people who are
involved in the transactions we call educational testing: the test makers,
the test givers, the test takers, and the test users. From this view of the
enterprise, two observations can be made. First, both within and
across the four groups of participants, there is an extraordinary
amount of diversity in their understanding of tests and in their
attitudes toward testing. And second, as mass testing has spread
throughout the schools of the nation, it has become more and more
compartmentalized—that is, disjointed—with the result that the
interrelationships among the four groups (the makers, the givers, the takers, the
users) have become increasingly strained and tenuous. And the
consequence of this is that communications among them are becoming more
and more like random events.

It is therefore probably not going too far to say that testing, like so
much else in this technologically interdependent society, is
characterized by entropy, which, according to the fourth definition in Webster's,
means trending toward "a state of inert uniformity of component
elements: absence of form, pattern, hierarchy, or differentiation ... the
general trend of the universe to death and disorder" (Webster's
Dictionary, 1965, p. 759). This may not be the most cheerful way of
putting the case, but, in all seriousness, I suggest that, for the next
decade, the overriding problem in the universe of testing is to find
ways of reducing the entropy by getting more adequate
communication among the human components of testing.

One strategy for beginning to tackle this problem might be to break
it up into six pieces by examining the failures of communication that
are occurring in each of the six pairs formed by the four component
groups. In the rest of this talk, I shall glance at what seems to be
happening in just two of the pairs: the one formed by the test makers
and the test users; and the one formed by the test givers and the test
takers in large-scale testing programs.
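
The count of six pairs is simply the number of ways of choosing two groups out of four. As a short worked check (the single-letter labels M, G, T, U for makers, givers, takers, and users are shorthand introduced here, not the speaker's):

```latex
\[
  \binom{4}{2} = \frac{4!}{2!\,2!} = 6
\]
% The six pairs, with M = makers, G = givers, T = takers, U = users:
\[
  \{M,G\},\ \{M,T\},\ \{M,U\},\ \{G,T\},\ \{G,U\},\ \{T,U\}
\]
```

The two pairs taken up in the rest of the talk are \(\{M,U\}\) and \(\{G,T\}\).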

So how do we get the test makers and the test users on approximately
the same wave-length? Among the test makers I include not
only the people who write test items and assemble them into whole
tests, but also all the backup personnel who plot the procedures for
test validation, for scoring, scaling, and norming, for estimating
test reliabilities, and so on. Behind them is the small army of test
theorists with their various and sometimes contentious notions of how
all these matters are to be arranged. The roster of people who
contributed to the Second Edition of the book Educational Measurement,
which Bob Thorndike has so brilliantly edited (Thorndike, 1971),
constitutes a small but fairly representative sample of what I am
choosing to call the test-making population. The test users, on the
other hand, are all those people—teachers, school administrators,
guidance counselors, parents, public officials, citizens' commissions,
and the like—who use test scores to make decisions of one kind or
another, but who, for the most part, have not read Bob Thorndike's book.

Indeed, the very excellence of the book attests rather dramatically 
to the widening gap between test users and test makers. For, as the 
latter have become increasingly sophisticated in developing the science 
and art of educational measurement—as indeed they have—the test
users, through no fault of their own, are finding themselves ever more 
deeply in the dark. 

The reason for this state of affairs is not far to seek. It lies primarily,
I believe, not in a deficiency of intellect among the test users, but in
the fact that, having become so many, they are now hurting from the
effects of Dyer's First Law of Information Dilution, which states that,
as knowledge expands while the population of potential users of
knowledge also expands, the probability approaches unity that
everybody is ignorant of what anyone else knows. In other words, the great
majority of test users simply does not have the time to look up or
catch up or keep up with the enormous number of tests and the
mountainous literature on testing that the test makers continue to
pile up. (Even some of the test makers themselves seem to be having a
bit of trouble in this respect.)

The dimensions of the test-making explosion are suggested by the
fact that the ETS Test Collection now holds 680 different tests in just
one category alone—reading. And it may be further noted that Oscar
Buros had to expand the Seventh Mental Measurements Yearbook
(Buros, 1972) to two fat volumes, supplemented by various interim
publications, to accommodate the output of tests and the literature
on measurement.

On occasion, in order to try to help people find what I thought
might be a short-cut to a reasonably tight definition of their
educational objectives, I have made the simple-minded suggestion to
citizens concerned about local or state testing programs that they would
do well to undertake a systematic examination of a number of existing
achievement tests, item by item, to see which tests would best reflect
the types of behavior they were hoping for in the children attending
their schools. Almost invariably their response to the suggestion has
been one of dismay. We are all busy people, they say in effect. How
can you possibly be so unrealistic as to suppose that we could find the
time to go through such a demanding exercise, even if we knew how?

Their point of course is well-taken. But the problem remains. How 
does one get the users of tests— especially those who use tests as instru- 
ments for determining educational policy— to know enough about the 
innards of the tests they are using to have some clear ideas of what the 
test scores are saying about what children are learning and schools are 
teaching? There has to be an answer to this question, and I think it is
incumbent on the test makers to find it. One step toward an answer 
is probably to be found in the efforts of the National Assessment of 
Educational Progress to get people to focus attention in the first 
instance not on scores but on the way students respond to specific 
questions. Another step will be to unravel the mysteries in the criterion 
referencing of test scores in such a way that the minds of test users 
will not be forever fixated on norms. 



III

Now let's move over to the opposite corner of the testing universe
where a somewhat different set of people problems seems to be flour- 
ishing. This is the corner where the test givers meet the test takers, 
and the ancient problem of test reliability comes back to haunt us in 
new ways. 

During the last 20 years or so there has been a growing concern and
a fair amount of research on the test-giving behavior of teachers with
diverse cultural backgrounds as it affects the test-taking behavior of
pupils with diverse cultural backgrounds: rich, poor, black, white,
Anglos, Chicanos, and so on. Although the research on the problem
is still spotty, it is nevertheless pretty conclusive in support of the
commonsense notion that pluralism in the test givers interacts with
pluralism in the test takers in ways that tend to depress the reliability
of test data. Which is to reinforce what the theorists have been telling
us all along: that the reliability of a test does not inhere solely in the
testing instrument itself, but in the total testing process. That is, it is,
to some extent, a function of the multiplicity of human transactions
that occur inside the examination rooms where the test givers and the
test takers meet one another. The problem, then, is the very
practical one of getting a good estimate of the degree to which these
human interactions are in fact influencing the reliability of the test.
To put the problem in a fairly concrete way, suppose you have a
large-scale testing program in reading and math in which 50,000
i^i^hhfgrade: pupils arc being tested by 1,500 teachers in 500 schools 
serving areas that range from densely urban through richly suburban
to sparsely rural. In this situation, how do you monitor the
test-giving-test-taking process in such a way as to get some sort of
believable estimate of the amount of error variance that will have been
contributed to the test results by the 1,500 test givers who come to the
task with varying hangups about the children and with varying degrees
of defensiveness, reluctance, and suspicion concerning the entire
testing enterprise? Similarly, how do you get the necessary
information for a believable estimate of how much additional error variance
will have been contributed by the 50,000 test takers with their varying
perceptions of what the testing is all about, their varying degrees of
understanding about what they are supposed to do, their varying
motivations, and their varying attitudes toward both the tests and the
testers?
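
One way to make the question concrete, offered only as a sketch: under the usual assumptions of classical test theory, and assuming further (this is not in the talk) that the test-giver, test-taker, and residual error components are uncorrelated with true scores and with one another, observed-score variance splits additively and reliability is the true-score share of it. The symbols and every number below are illustrative, not estimates from any testing program.

```latex
% T: true score; G: test givers; K: test takers' dispositions; e: residual error
\[
  \sigma^2_X = \sigma^2_T + \sigma^2_G + \sigma^2_K + \sigma^2_e,
  \qquad
  \rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_X}
\]
% Illustrative values: sigma_T^2 = 80, sigma_G^2 = 5, sigma_K^2 = 10, sigma_e^2 = 5
\[
  \rho_{XX'} = \frac{80}{80 + 5 + 10 + 5} = 0.80
\]
% If only the residual term were counted as error, the same data would seem more reliable:
\[
  \frac{80}{80 + 5} \approx 0.94
\]
```

The gap between the two figures is the point of the passage: a reliability estimate that ignores the giver and taker components overstates how far the scores can be trusted.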

Thus, it seems to me that it is within this congeries of human
relationships that we have to define the reliability problem as it exists in
the real world of testing. I think it is a problem that is badly in need of
more attention than it has been getting from the test makers. Failing
this, I am afraid that debates over what test data mean and how far
they can be trusted in the formulation of educational policy will, like
the never-ending debate touched off by the Coleman Report (Coleman
et al., 1966), be forever inconclusive. It seems to me ironic that in a
67-page paper on the quality of the data in that Report, Christopher
Jencks devotes barely one page to the reliability of the achievement
tests that served as the dependent variables, and even so, gives nary a
hint of these human aspects of the reliability problem (Jencks, 1972).

There are many other old problems in the universe of testing that 
may need some reshaping to bring them into line with the human con- 
dition. Some of them have been touched on in the papers you heard 
this morning. The others will have to be your homework for the next
Invitational Conference on Testing Problems. One particular matter
that I hope you will look into is that of recycling the test validation
problem to fit the case of those 50,000 children in 500 schools who are
trying, through a single test channel, to transmit to test users with
diverse needs bundles of messages about their mathematical
competence, which may differ from school to school because
of the differing creeds about mathematics to which different schools
subscribe.

In my view, none of the problems I have hinted at is essentially
insoluble. Indeed, I'll hazard the guess that the solutions to most of
them are lying right now buried somewhere in the technical literature
—but not just the psychometric and psychological literature—in which
case, the immediate task is to get them out of the literature and
translate them into terms that will make them functional in school testing
programs.

The accomplishment of this task, however, assumes that all parties
to the test communication network can acquire sufficient wisdom to
break out of the mind set that perceives tests primarily as devices for
sorting out test takers to accommodate the rigidities of educational
institutions, rather than as instruments for loosening up the
institutions to accommodate the developmental needs of the test takers in
ways that will enable them to do something to diminish the entropy
in this terribly troubled world.

In the luncheon address that Daniel Starch gave at the 1954
Invitational Conference (Starch, 1955), he asked and attempted to answer
the question: "How can advances in science be made to produce
advances in wisdom?" That question, it seems to me, is as pertinent to
the science of testing as to any other field. It reminds me of two
searching questions that were put to me recently during a conversation with
a Vietnam veteran.

He was a young black who had dropped out of an inner-city high
school at grade ten and had enlisted in the Marines. This step, as he
put it, had "saved him from the streets." It had also enabled him to
pass the high school equivalency test. But it had given him an
educational experience, some of it rather grim, beyond anything he might
have got had he stayed in school, even though he claimed that he was
still a slow reader. His two questions were these:

First, "Do you think that people these days are generally smarter
than people were in the old days?"

I fumbled around with that one, and, thinking of the studies
comparing Army Alpha scores in World War I with AGCT scores in
World War II, I said, "Yes, people these days probably are smarter."

He said, "Yeah! But do you think they are any wiser than






That's a problem I am leaving for your homework.
Where Is Preschool Education Going:
Or Are We
En Route without
a Road Map?¹

Irving E. Sigel
State University of New York
Early childhood education (ECE) may well face a new challenge in the
1970s, particularly in its role as an agent for preparing children from
disadvantaged backgrounds for competence in school. In the 1950s
ECE faced a decline, where nursery schools were phasing out, where
students were not enrolling for teacher training; the 1960s saw a
resurgence and a headlong pell-mell renascence of ECE, culminating in
Head Start-type programs heavily funded and introduced with
considerable fanfare. The mission of ECE became remedial: the way to
break the cycle of poverty, reduce the chance of educational failure,
improve the health and welfare of the newly discovered poor. Early
childhood education was cast into the savior role: the institution that
would eliminate the presumed intellectual, social, and emotional
deficits among children and families from impoverished environments.
From 1965, when President Johnson dubbed the Head Start program
a solution to a major social problem, to date, ECE has been discussed,
evaluated, tinkered with, criticized, and in general has been the center
of considerable professional and lay interest (Gordon, 1972).
Interestingly enough, a large number of psychologists who ten years before
might never have thought of identifying with ECE became deeply
involved and committed. The move from the laboratory to the
classroom, from the observer to the intervener, from the detached scientist
to the involved practitioner, shows how attractive the movement
toward ECE has been (Stanley, 1972). In spite of harnessing some of the
best professional talent in the country, in spite of relatively vast sums
of money appropriated to the cause, and in spite of hours of devoted
commitment on the part of parents and of professionals from virtually
every relevant discipline, the hopes and expectations of ECE for
achieving the defined goals have not yet been fully realized. I am afraid that
the 1970s may see a period of disillusion set in, a retrenchment from
the exciting, creative period to one of skepticism, doubt, and even
retreat from ECE (Horowitz, 1972). Recent surveys and public
pronouncements contribute to the swing of the pendulum from great
hopes to great disillusion (Jensen, 1969; Jencks, 1972).

¹ Special thanks to Dr. Robert Pruzek for his helpful comments.

The obtained outcomes should not have been too surprising if a
careful reading of previous research on long-term remedial effects of
ECE had been done. Swift, in her review of ECE, points out that results
were inconclusive for middle-class children; when lower-class children or
those from deprived homes were involved, IQ changes were noted
(Swift, 1964). This latter group of studies, however, was limited by the
uniqueness of the study populations (orphans, abandoned children,
for example). The intent of these studies was to show the impact of
environment in IQ terms, in an effort to support the optimism of
environmentalists, to wit: that environment is the essential
determinant in development of intelligence and an appropriate environment
can counteract the negative consequences of undesirable
environment. The current perspective in ECE has been that early intervention
can influence those attitudes, behaviors, concepts, and skills
predictive of adequate school functioning.

An increasing number of studies have been showing that these pro- 
grams have not achieved these goals, and further, cannot work. Early 
childhood education is in fact not the way to resolve the problem of 
poor school performance for these children (Jensen, 1969). Some 
critics have gone so far as to claim that ECE is harmful and that formal
schooling should be delayed even longer (Moore, Moon, & Moore,
1972).

The challenge thrown to the advocates of ECE, then, is to justify it
as an educational experience in the face of these reports, justify it in
terms of cost, of time, or for reorganization of our educational
institutions in relation to outcomes. If advocates of ECE fail to offer
constructive suggestions now, a premature decline of ECE could result.

The challenge is understandable in a society such as ours where we
justify social programs from an economic perspective. The cost factor
often overrides humanitarian goals, and programs however meritorious
are rejected as costly in terms of dollars, not in terms of reduction
of human misery or enhancement of human potential. My point is
that the value of ECE as a broad-based education for all children,
integrated into the educational mainstream, has yet
to be evaluated. In this paper, I will present
arguments for ECE as a worthy educational effort.

Early childhood education as a large-scale national effort is a mere
decade old and then only geared for children from economically
disadvantaged backgrounds. Before rejecting ECE as an educational
entity of value, let us review briefly where we were ten years ago, what
we had to do, and what we have accomplished in this short decade.

With the exception of the Montessori program, there were virtually
no well-articulated, defined, and worked out ECE models for the group
served.² Interest in the educational aspect was at a low ebb. Relatively
little educational research was being carried out with preschool
children, and the studies that were done were particularly concerned with
personality development and its antecedents. Little attention was
paid to learning, cognitive functioning, educability, language
acquisition—all areas that have flourished in the past decade. In essence the
field of child development in general and ECE in particular was
fragmented, limited in scope, atheoretical. Early childhood education
was particularly heavily influenced by mental health considerations
(Beilen, 1972). In sum, the field was ill-equipped to meet the new
challenge—the readying of minority group impoverished children for
school.

To meet the new demand to remedy intellectual and social deficits
of disadvantaged children, a basic and fundamental reorientation for
ECE was needed. New programs had to be built which would be
relevant; "teaching" had to be in terms of concepts, skills, and
behaviors relevant for elementary school.

The only way the field could cope with these new social demands
was to bring together its best hunches, the little empirical knowledge
available, and much experience. In addition, crash research programs
had to be instituted.



² This is not to say that ECE programs were nonexistent (Bank Street College, The
Merrill-Palmer Institute). They tended to be broadly gauged, focusing on
socialization rather than on cognitive and language skills. Also they were generally not
"packaged" and prepared for the mass education effort demanded by Head Start.
The urgency to establish programs in spite of the lack of knowledge
dealing with so-called deficits of the impoverished meant that the task
of programming the educational effort was put together with little
time for thoughtful planning, coordinating the relevant knowledge
from the various disciplines (education, psychology, sociology,
anthropology, biology), and pilot testing. It was a crash program
where the social demands and services were primary. In effect,
programs were really pilot efforts, and hence these evaluation studies
should be generally considered as pilot studies. Under these
conditions, how could we expect that remediation plans would be clearly
articulated and carried out to ameliorate the deficits of varying
intensities and enable children to profit from schools in ways superior to
their peers who did not attend such programs? In addition, the
expectation that short-term gains, if any, in these programs could be
maintained over extended time periods was a heavy burden, one that
required cooperation of the field of elementary education—a type of
cooperation not always received. The response to this was
disillusionment with early education intervention as a solution to the "deficit"
education problem.

I am not disillusioned by the disillusionment. From the research
perspective, the expectations were unrealistic and even naive. We
knew our conceptions were inadequate, our measures crude, our
research designs fraught with error. In spite of sophisticated techniques
and data analytic models, the initial effort should be viewed as a pilot
project. There are some exceptions, but even they are full of
conceptual and methodological shortcomings (Stanley, 1972).

We ought not to be disheartened about all of these negative
research findings for reasons which I shall discuss. From the humanistic
point of view, I am not disappointed, since a large number of children
and their parents had valuable and interesting experiences, improved
nutrition and diet, opportunities for general health examinations, and
an enrichment of parents' and children's lives. In fact they had
experiences which otherwise would have been denied them. In sum, a
lot of children and parents had a lot of happy times, a break from the
harsh realities of a life of poverty.

Congress and economy-minded groups, however, are not impressed
by these seemingly sentimental impressions, so let me turn to perhaps
more convincing reasons for my lack of disillusionment.

We have before us a greater body of knowledge about early
childhood development than ever before (Mussen, 1970). We have before
us a more comprehensive understanding of educational settings,
programs, materials, and teacher strategies (Gordon, 1972). For example,
the research of Louise Miller at the University of Louisville has shown
how different preschool programs have different effects over varying
lengths of time (Miller, 1972). Her research shows that ECE programs
are no^^mqn^ in spite of the claim of the proponents. The sAi 
bo»d/oir material has contributed and can contribute to our
understanding of the dynamics of growth. We must not confuse our
impatience and disappointment in not creating the expected long-term
social change in such a short time with the increasing knowledge base.
In fact, we are in a better position to plan now for ECE programs than
we were in 1965. In addition to our increase in knowledge, we have
increased communications, we have established such communication
centers as ERIC, increased participation and communication among
various disciplines. Let us not overlook the potential in resource we
have harnessed for continued efforts. Thus, we have gained much by
this recent impetus in ECE.

I believe that the impact of ECE on the children also is greater than
assumed, but empirical proof of this resides in a reevaluation of our
expectations and our data, reconceptualization of the concept of
impact, and a reinterpretation of evaluation procedures.

First, consider the general expectations assigned to ECE. It was
believed that ECE alone could ameliorate severe or serious deprivation—
an unrealistic idea. Amelioration of consequences of psychological,
social, or nutritional deprivation coupled with poor health care is a
major undertaking requiring the concerted effort of many agencies.
No ECE program can be so broadly conceived as to undo or redo the
impact of the total environment that is hostile, rejecting, and
neglectful. To attain such social goals requires drastic social changes
in practices, attitudes, and feelings among the various ethnic and
racial groups.

I do not look to ECE as a panacea for social ills. Rather, ECE can
provide opportunities for all children to engage in educational
experiences which at least in the present can extend their knowledge,
provide a break from some of the everyday problems at home, and
provide a chance to extend their horizons. Such experiences do provide
children with opportunities for excitement, a zest for doing, for
learning, for enjoying—and none of these are particularly undesirable or
unworthy objectives.

Further, it is ironic that the expectations we set for elementary
school are not as grandiose as those for ECE. Do we worry about the
long-term effects of kindergarten or grade one? If grade one experience
facilitates grade-one learning, that would be enough. I doubt if we would
try to relate grade one to junior high school performance. And if we
find no relationship, would we do away with grade one? Why make
these types of demands on ECE?

I feel that the reasons we have concluded that ECE programs are
virtual failures may in part be due to the way programs were
conceptualized and data were analyzed and evaluated. In the remaining
section of this paper, I wish to suggest some procedures which I
believe will reveal that we have lost much information on the impact of ECE
because of how we have approached the problem. Let us begin by
defining ECE in behavioral terms.

Early childhood education involves engaging children in experiences
which can influence the course of their growth. It is essentially a
developmental issue. However, I fail to see where educators and
programmers conceptualize the entire process in developmental terms.
Evidence for this assertion comes from reading current theory and
noting evaluation procedures which tend to view development as
incremental, cumulative, and linear, overlooking the dynamic
integrative nature of development (Werner, 1957). Narrowing of the
conceptualization of development to study of change alone is a distortion.
It leads to the misconception

. . . that development is the kind of anarchism where totally unrelated and
discontinuous stages follow each other in sequence; or it may lead to the
kind of historicism where new or modified behaviors are merely added
on in a continuous temporal order as the organism grows [Langer, 1970,
p. 733].

Development is a dialectical process in which the organism's
organization becomes increasingly differentiated and specific. In the course
of these processes new integrations occur, so new learnings are
increased in the old and the net result is a new organization.

In other words, think of development as a spiral: learning occurs at
one level and then is reintegrated into a new level, not just added on.
Learning vocabulary is not just adding the word in an associative way
but integrating the word in a context.

If development is conceptualized in these qualitative
(organizational) terms, it becomes possible to construct measures assessing
levels of functioning. For example, we employ a task to determine the
child's level of representation. The task is a four-stage task moving
from the concrete to varying levels of representation: graphic, verbal,
and so on. Children vary in the level at which they might solve the
problem. This type of task can express the concept of development.
Thus, the child who succeeds in the four conditions is not
incrementally superior, but is qualitatively different, on a different level from the
child who can handle only the concrete condition. I believe we need
more tasks that are constructed in this fashion.

Evaluation of developmental changes, then, requires procedures which
tap into qualitative, as well as quantitative, differences in
performance. Changes in children's responses to Piagetian conservation
of substance tasks are good examples of what I mean. The classical
conservation task might well be the prototype. Let me illustrate by a
discussion of the conservation of quantity task.

The child is given two equal balls of clay. Once he attests to
equality, one of them is altered in shape and the child is asked if
they are still equal in quantity. The child does not solve the problem
when about four or five. As he grows older, he begins to solve the
problem, but he gives different answers, increasingly sophisticated. At
first, he says the two balls of clay are the same because one is just
flattened out; later he solves the same problem but now uses a more
sophisticated explanation: nothing was added or subtracted. More
complex answers are given later on, such as invoking principles of
compensation: as something gets flatter it gets longer, with quantity
remaining constant. The change in the child's conception is not
evidenced in his answer to the initial question, "Are these two the
same?"; rather, it is in the explanation of his answer. Only by this
method of interrogation does one discover the qualitative change. This
is a prototype of the evaluation I feel is necessary to assess effects
of ECE.

Another issue when working within a developmental framework is to
examine the interaction of particular achievements. Basically the
question is to assess effects of change (or lack of it) in one area on
performance in other areas. For example, what is the consequence of
enhancement of skills in fantasy or verbal imagery on attention? It is
conceivable that high premiums paid for fantasy and imagery might lead
children to move away from attending to the specific concrete.
Interference of one act of learning with another is important because
it may well result in neutralizing potential gains.

Accepting this perspective, investigation will have to examine the
interaction of various tests; it will require programmers to
conceptualize and evaluate interaction effects of various program
components. Unfortunately, most programmers either ignore the
interaction of intra-organismic subsystems or pay lip service to it.
The end result is that little is known about the interactive effects of
emphasizing particular program components vis-à-vis others. For
example, what is the effect of language stimulation on social behavior
or task orientation on creativity and problem solving?

Thus, program builders and evaluators working within a developmental
framework would have to embark on componential analyses within an
interactive analysis framework. This moves away from univariate-type
approaches and also away from heavy reliance on product-oriented tasks.
Now, new analytic models have to be created that handle synergistic
relationships and provide data on the significance of particular
interrelationships among variables. Treatment effects have to be
examined not in terms of change of a single measure (for example, IQ,
language comprehension) but in terms of pattern changes. For example,
are there concomitant changes, or are changes in some areas marked by
no changes in others? By this type of examination the more subtle and
perhaps significant effects of the program can be ascertained.

Interpretation of the results, however, is difficult for four reasons:
1) the type of task employed; 2) the timing of testing; 3) the
situational conditions; 4) the mode of analysis.

Tasks frequently are unrelated to program objectives. Rather than
achievement-type tasks, IQ or similar measures are frequently used,
not necessarily integral to program goals or content; and these
measures are not necessarily those which predict to school performance.
These measures should serve a "swivel-chair" effect, evaluate
consequences of the ECE program, and be predictive of performance in
school. The reason is simple. There is no true integration of these two
educational experiences. Components for each program have to be
identified in variable or dimensional terms and appropriate tasks
created. This procedure is not only expensive in time and money, but
also speaks for a certain conceptualization. If acceptance of the
significance of this viewpoint is not evident, of course it will not
happen in spite of funds or other conditions. This lack of integration,
I believe, negates a meaningful relationship between ECE and elementary
school. Further, it does preclude an understanding of various
developmental processes because development is not necessarily
dependent on an artificial classification of children's educational
status. Thus, if a goal of ECE is predictive of elementary school
performance, integration is necessary to define the impact of ECE.
Construction of achievement tests for program components should be a
must.

Evaluation measures are usually cast in a pre-post design. Examination
of process, those conditions influencing effects, has been done but not
sufficiently to identify relationships between inputs and performance.
Of course, such testing calls for careful consideration of ways to cope
with practice effects, subject overtesting, and subjects feeling
constantly under scrutiny. There is no need to dwell on a point
familiar to ETS audiences and those of you engaged in research.

Third, a crucial factor influencing test performance is the situational 
factor—the context of the evaluation. Cole and Bruner (1972) state the 
case well by saying 

. . . when we systematically study the situational determinants of
performance, we are led to conclude that cultural differences reside more in
differences in the situations to which different cultural groups apply their
skills than to differences in the skills possessed by the group in question
[pp. 175-176].

Psychologists concerned with comparative research and comparisons
of social and ethnic group differences in particular must take seriously
the study of the way different groups organize the relation between their
hands and minds; without assuming the superiority of one system over
another, they must take seriously the dictum that man is a cultural
animal. When cultures are in competition for resources, as they are today,
the psychologist's task is to analyze the source of the culture differences
so that those of the minority, less powerful group may quickly acquire
the intellectual instruments necessary for success in the dominant culture,
should they so choose [pp. 176-177].

Cazden makes a similar point in evaluating language studies (Cazden,
1970). The challenge here is considerable, for we now have to devise
ways of interpreting results obtained under these conditions. Now we
have to seek new ways of dimensionalizing situations so we can compare
performances not only in terms of scores, but in how these performances
relate to the dimension of the situation.

For example, some situations in which children are evaluated
involve time pressure demands, where others are free and untimed.
Differential behaviors and competencies can occur as a function of
the coerciveness of time. Which is the more accurate statement of
ability? Do we not have to qualify the statement in terms of the time
dimension?

True, the above considerations are often taken for granted in adult
testing. The error committed in that instance does not justify our
continuing it with children.

The social context, with its supports, has an impact on performance. 
How the adult treats the child relative to the child's expressed needs 
influences performance (Zigler & Butterfield, 1968). We come head on 
to the issue of standardization and the degree to which this precludes 
our maximizing information about the child's knowledge.

Fourth, modes of data analysis must avoid premature foreclosure
by formal statistical analyses. I believe much is lost by the rush to
quantification. At this time we need to inspect consistencies in the
data through careful monitoring of them. Relationships between
variables can be arrived at by reflective examination of configurations.
Identifying patterns of variables for single individuals as well as groups
will be conducive to generating more realistically complex statements
of hypotheses. An example of this type of hypothesizing is as follows:
children who are task oriented, attentive to directions, and have good
verbal comprehension will do better on novel tasks involving verbal
instructions.

The evaluation of the child must include his experiences if we are to
understand the dynamic relationship between ECE program and skills,
attitudes, and concepts necessary for later schooling. To do this
requires that we reexamine our view of the nursery school itself. If we
consider the child a dynamic organic whole, so, too, the nursery
school. It is by its very nature an organism, that is, a system, open
and fluid. The preschool as an organization is made up of components
which interact and intersect. The major components, grossly delineated,
are: actors (teachers, parents, children); materials (books, paints,
blocks, trips, and so on); space (rooms). Each set of actors' roles is
predefined. For example, teachers employ strategies with materials in
the space to effect certain outcomes with children; the children are
the "learners" reacting to or perhaps initiating teacher's behaviors in
the same physical setting. The parents' roles are more variable
depending on the objectives of the program, but still with definable
limits. These roles are rarely overlapping: teachers are not children;
children are not teachers; and parents will of course vary, but in fact
will not be teachers in the same sense as the teachers. Each of these
is in a complex organization of definable and partitive wholes.

The object of this type of analysis rests on the assumption that the
better we understand the child's place in the system, that what
influences him and what he influences, influences the system, the better
able will we be to evaluate impact and significance of program
components individually or in their unity. Acceptance of this argument
requires creation of methods to assess the situation longitudinally and
interactionally because we are dealing with a continuously changing
organism whose change also influences the system.

When one examines the problem from a developmental perspective,
it is necessary to identify significant relationships at any given time
and follow this through various transformations, for each segment
impinges on subsequent ones. For example, once a child masters
color concept, such mastery may influence his performance on puzzles;
in effect, each level (or segment) of achievement sets the stage for
a subsequent performance. Achievements, however, are not always
so contiguous. For example, learning colors and improvement in
hand-eye coordination tasks may result in more complex block building
structures in the present, or later: a week, a month, or a year.
Thus, change in behaviors as a function of ECE is not necessarily and
directly one to one, but may be indirect or long term. To understand,
then, the impact of such variables requires the systematic
developmental approach advocated here. In a sense, we can learn from the
medical model which argues for synergistic effects of phenomena as
well as side effects. Ironically this was an approach employed by
those working from a psychoanalytic framework. Perhaps, as we
moved toward quantification we moved away from this. With our
more sophisticated techniques and computers, it may now become
possible for our conception and our data analysis procedure to be
more closely aligned.

One of the potential outcomes of this type of analysis is to enlarge
our knowledge of human development since it now becomes possible
to examine intra- and interorganismic relationships. Granted the
complexity of the problem, we have a source of data which would be
tapped as a resource in contributing to solution of such critical
developmental issues as effects of early experience on later development,
pinpointing particulars in this sphere.

I have wandered far and wide to explain why I am an advocate of
ECE. Until the data are in, I see no need to be disillusioned. I feel ECE
can teach us much about child development; it can provide valuable
experiences for children, today, now. It can also have an important
influence on education as a whole. Integrating ECE into a school system
may also influence subsequent school organization, curriculum and
Anderson and Shane (1972) say that

The [...] effect of ECE on all subsequent schooling must not be
[...] type of clients who enter the primary program
[...] probably [...] changes and mutations that will ripple
upward through [...] school years into the secondary [...]

[...] ECE influence the school situation, it also can have
[...]. It can provide opportunities for children to
be cared for in a protected situation while parents have their
opportunities for enriching their own lives. Granted this is what Day
Care is all about; for me, Day Care should incorporate an educational
component. [...] it does, it becomes an ECE program.
[...] ECE contributes on a number of fronts: scientific, social, and
[...]

Before concluding, however, let me leave you with a dream, a
pleasant one, which would be the marriage of avant-garde
methodologists with substantive developmental theorists yielding
broad-scale research models for ECE. I do not believe the past efforts
were adequate in time and resources to accomplish the crucial tests.
Efforts were aborted resulting in too few, if any, mature offspring. If
we are serious about demonstrating ECE as a viable educational system
with denotable consequences, we need large-scale, well-supported
research.

I hope I have offered suggestions which can be implemented. I have 
taken the liberty of including an appendix to demonstrate how I have 
tried in my own program with two-year-olds to practice what I preach.



APPENDIX

Conceptual Analysis of Early Childhood Education Project

(SUNY/Buffalo, New York)



The discussion will be concretized by a presentation of the Early
Childhood Education Project, of which I am director at SUNY/Buffalo.
Briefly, the program was begun in 1969 under the auspices of
Office of Economic Opportunity support. Children were enrolled in
1970, at age two, and have been in the program ever since. Now to the
analysis of the program in perhaps over-simplified system terms.

The theoretical base derives from a Piagetian orientation, with
particular concern for the acquisition of representational competence.
[...] competence is defined as the ability to con[...]
the environment in terms of symbols and signs [...].
[...] objective, then, of the program is to [...] which would
foster such development, because rep[...] basic to thought and
adaptation to the [...]. [...] areas of interest are reading,
translating [...] events and objects into signs and symbols, awareness
of the [...] equivalence of pictorial and linguistic representations
of identical physical objects. Given these assertions and definitions of
the phenomena, the next step was to construct conditions which
would foster opportunities for such behaviors to emerge. Still thinking
on a theoretical base, I began to try to reconstruct the life experiences
that might activate, maintain, and facilitate the representational
processes. What I began to do was to extrapolate from the literature, and
reflect on that plus my own experiences. These reflections led to the
idea that the critical experiences that activate representational skills
can be described as distancing experiences, "those which serve to
differentiate the environment in time (present-past-future), in space
(here-not here), and in appearance as opposed to reality
(observable-inferential). Such differentiating experiences have been
termed distancing since distance is created between subject and object
(Sigel, 1971, p. 26)." The concept of distance as such is not new.
Werner and Kaplan (1963) use it in discussing language and symbol
formation, and Piaget uses it once. None of the authors use it in the
same sense and for the same purpose as proposed here.

Before proceeding to detail the building of the model, I should in 
passing point out that inherent in the above statement of the problem, 
a number of assumptions, unstated or unasked, are made, such as: 
representational competence is a generic competence and is learned; 
experience activates an organism; there is a legitimate distinction be- 
tween subject and object. I accept these assumptions without further 
debate. Let us leave these comments, then, as signals for further
consideration and return to the developing program.

The articulation of the above assumptions is more than an academic
exercise or pretension. Rather, it is a must, because they provide the
theoretical base from which to derive the logic of program teaching
strategies (activators), materials (appearance-reality), organization of
the classroom (time and space). By clearing these issues I am now
ready to undertake the practical problem of training people in the
teaching strategies (teacher training), build a curriculum (ordering of
events and materials in the program), and place it into temporal-spatial
contexts. These then become, on the gross level, the components of the
system. From here the interaction of these components will play out
[...] control can only be exerted within certain limits and
under certain conditions.

This model, however, is incomplete and specialized because it has
not taken into account other organismic variables; for example, the
role of affect, particular linguistic conventions, the quality of support,
and reinforcement of articulated representational behavior. Since
these variables were not spelled out, the teachers were not certain how
to respond to each of these and still be consistent with the system. For
example, the teachers had no trouble with the cognitive dimensions of
the teaching strategy, but they asked: How does one maintain
consistency between cognitive interactions and disciplinary-management
issues? Is it possible to discipline or manage a child in such a way as
to undo what one is attempting to create in the cognitive sphere? If
one teaches that class membership is arbitrary, for example, depending
on what attribute is selected as criterial, how does this fit with
arbitrary statements of right and wrong made by the teacher in course of
interaction with peers? Further, how does one scale the teaching
strategies and all the rest of program in accord with the child's
capability to assimilate such experiences?

The point that must be underscored is that the translation of theory
into practice requires a series of careful steps of derivation. I have
gone through the process, finding that the teacher, while understanding
the theory and the distancing hypothesis, still needed much guidance
in developing appropriate teaching strategies.

To know that we are accomplishing our program objectives requires
an evaluation of the child's representational ability. Evaluation is in
terms of direct measurement of achievements. If inputs in the program
are to foster representational skills, then we measure representational
skills. We do not measure IQ, or high jumping, or other more
irrelevant matters. The program objectives include skill in imitation
(immediate and deferred), imagery, use of referential language as
classification, anticipation, transformation of three-dimensional
objects into equivalent pictorial and verbal forms. For each of these a
task was constructed or borrowed from existing tests. In this way, I
am testing the degree to which the child could handle problems in a
new setting with unfamiliar materials, but which involved the same
type of process engaged in, in the nursery school. In effect, he was
given achievement tests, a standard way in education to assess
whether he is profiting from the curriculum.

Now we come to a most complex task, data analysis. Recall, we
[...] array of tasks, testing his ability in various
components of representational competence.

In addition, data gathered includes observations of child behavior
in [...] and tutorial sessions, rating of the children by the
teachers, and observation of the teachers.

With the help of Dr. Robert Pruzek,* we are developing a system for
organizing our data so that they can be visually examined as
configurations at the level of the individual child, not just as summary
data. This system allows considerable flexibility to construct clusters
of data defined in various ways, for various modes of classification and
at multiple time points. Through analysis of various clusters and
comparative analyses between individuals we are in a position to
establish hypothetical relations for eventual testing [...]. This type
of approach is motivated by the felt need to incorporate the systemic
variables. The procedure facilitates understanding of intra- as well as
interindividual relationships in this context of the nursery school. By
using logical analysis to form clusters of variables and individuals, we
are, in effect, generating structural hypotheses about interactions of
individuals and variables.

Finally, I raise one critical issue in all this, and that is the problems 
related to the sample of children and the design of our experiment. 
Suffice it to say that we are all plagued with problems of sampling, of 
sample selection, of sample characteristics, and sample size. Messick 
and Barrows (1972) tell us not to despair because quasi-experimental
design models are possible, advice we followed the best we could.
Not only is this a possible solution, but it is only a step in the move
toward more replication, which in turn calls for a greater integrated
comparative effort.

It is time we planned in this direction instead of crying about what 
was and bemoaning the fact that our educational intervention is not 
as powerful as we would like.



*Associate Professor, SUNY/Albany

It can be argued that the central dynamic of American social history
is rooted in group conflict and the efforts of the new nation to achieve
a workable reconciliation of diverse and heterogeneous group
interests. From the very outset, education was not only a critical
component of this process of adjustment but was, itself, deeply affected
by the pluralism of group needs and group desires.

As much of the history of America from the 17th century up to this
very day indicates, heterogeneity among peoples did not and does not
imply equality among the contending groups. Taking full advantage
of the superior technology of England and Europe, dominant groups
in Colonial America saw education as a powerful tool with which to
buttress their self-esteem and their social position. Recognizing but
fearing the wide pluralism of society in the colonies, education became
a prime instrument of cultural change in the direction of greater
homogeneity. Catholics and Jews as well as Africans and Indians were
viewed with much apprehension in the wilderness colonies of early
America.

But it was the use of education as a dynamic instrument of social
change with respect to the Indians that created a pattern, the vestiges
of which continue to complicate education for American minorities
to this very day. While the effort to educate and convert the Indian to
European ways of life failed, it left what one observer calls

an ineradicable mark on American life. It had introduced the problem of
group relations in a society of divergent cultures, and with it a form of
action that gave a new dimension to the social role of education. For the
self-conscious, deliberate, aggressive use of education, first seen in an
improvised but confident missionary campaign, spread throughout an
increasingly heterogeneous society and came to be accepted as a normal
form of educational effort [Bailyn, 1960, pp. [...]-39].

This tendency to treat the education of minorities as a process of
conversion to North European life styles is central to the problems
confronted both by black colleges and black studies programs.
Reared upon the fundamental assumption that there is nothing positive
in the group life of minorities to which an educational stratagem
can be fruitfully related, most white scholars, from Charles Darwin,
to Gunnar Myrdal, to Daniel Moynihan and David Riesman, have
seen only pathology in Afro-American history and culture. Thus,
blinded by the lights of Northern Europe, Gunnar Myrdal (1964) in
his monumental work, An American Dilemma, writes that:

The whole Southern Negro educational structure is in a pathological
state [p. 951].

The challenges faced by black colleges and by the supporters of
black studies programs are comprised, in large measure, of this
cultural myopia on the part of white Americans and Anglo-conformist
black Americans. The challenge to their very existence has been made
doubly difficult in recent decades by the fact that racial integration of
the schools has been conceptually elaborated within the framework of
American constitutional theory which gives no legal recognition to
group needs or desires. As Milton Gordon (1964) has observed:
"From the legal point of view, there are 190 million discrete American
individuals [p. 4]." Therefore, the racial integration of the schools and
of the colleges and universities tends to be understood as a process of
accepting "qualified" black and other minority individuals into largely
white educational structures. Supported by the major philanthropic
foundations as well as by the federal government, "talent searches"
for outstanding black students and black faculty have been organized
and conducted on behalf of the predominantly white universities but,
to my knowledge, not a single such talent search has been supported
or organized on behalf of black institutions in the interest of the racial
integration of the black schools and universities. The whole process
of attempting to integrate our schools and universities has been
tainted with the tincture of a subtle form of white liberal racism which
"accepts" black individuals but views the black heritage and black
institutions as distinctly inferior. This racism of white liberalism has
been crassly exploited by both northern and southern conservatives
and has led to the humiliation of black educational leadership all
across the South in the name of the very process of change that was
originally intended to provide a larger measure of dignity to black
people in this country.

There was a time, and not so long ago either, when black people
and their allies felt that the twin forces of urbanization and the
destruction of the legal and quasi-legal pillars of segregation would
lead to a meaningful increase of opportunity for all black people.
Leaders in the black colleges felt little fear or anxiety about these
processes for they naively assumed that efforts to desegregate our
society would make full use of the experiences and resources of the
black colleges as well as of other institutions in our society. They were
therefore unprepared for the assaults upon them that were led by the
Riesmans and the Jenckses and the Kenneth Clarks. These observers
viewed the black colleges as being purely "the products of white
supremacy and segregation (Riesman & Jencks, 1968, p. 469)," and
therefore as "historical anachronisms" in the new order of an
emerging, integrated society.

Nonetheless a growing number of black and white observers have
begun to make a closer analysis of the strategy and objectives of the
civil rights movement and of the impact of urbanization upon the
patterns of American race relations. One result of more recent social
science research on the general question of how people act in groups
and, more specifically, in situations of interracial contact, has been to
shift away from the atomistic conception of society that tended to
ignore the structural and collective dimensions of intergroup behavior.
Thus, a growing body of social scientists are giving less attention to
"the authoritarian personality," or "the mark of oppression," or even
"the nature of prejudice" as meaningful frontiers for social or
psychological research. Most research efforts today are based on the
theory that society is not an aggregate of discrete individuals but a
mosaic of special interest groups differing in prestige and power, led
by initiators of action who make critical decisions. In this view,
individual personality problems are related more to institutionalized
and culturally accepted social patterns than to discrete ego needs.
This shift in focus has brought into view what some call a "new
frontier of race relations." The central reality on this "new frontier"
is the behavior of institutions. The concept of "institutional racism"
(see especially Knowles & Prewitt, 1969) has achieved prominence
among some social scientists and along with it the concomitant
problem of how to achieve the social conditions which make possible
group attainment for American minorities. Within this approach,
social policy analysis is directed toward the issues and problems of
"deracinating" white institutions by transforming them into agencies
of minority group achievement rather than instruments of nonwhite
oppression. More importantly, this perspective is able to identify and
embrace positive elements in the history and culture of minority
groups. The potentialities for power and strength are emphasized
more than the elements of pathology and weakness. The salient point
is the focus upon institutional behavior and practices with the
objective of enhancing group achievement rather than looking mainly at
individuals. As sociologist Earl Raab (1962) observes:

the formula may have to be reversed . . . extended individual opportunity
may depend finally upon group achievement. This is, hypothetically, the
new frontier of race relations [p. 20]. 

These developments in the realm of social theory have appeared
almost concurrently with a growing mass of data which show that the
relative position of black people in the economy improved somewhat
between 1959 and 1966, while the absolute situation declined. Similarly,
Andrew Brimmer, among others, has shown that while the absolute 
and relative situation of middle-income blacks is getting somewhat 
better, that of low-income blacks is getting worse. There appears to be 
developing a growing under-class of black people who are failing to 
benefit from any of the changes in the American society. 

The impact of the increased urbanization of black people in the 
country is even more depressing. I know of no large-scale public sys- 
tem of education in any one of the major urban centers of our country 
that has successfully come to grips with the problem of educating 
large numbers of black and other minority youngsters. Nor is the 
physical isolation of the urban Negro diminishing. In almost every 
big Northern city the proportion of Negroes attending all-Negro 
schools is rising, despite some ingenious and sincere efforts to arrest
the process. In housing, more Negroes are moving to the suburbs, but 
many of them have moved to what are, or will soon become, all-
Negro sections.

This quick survey of some of the sociological and economic facts
affecting blacks in our country today forces upon us the conclusion
that while legally enforced segregation is waning, it is being supplanted
by an ethnic-class system in which group solidarities persist in such
institutions as the family, the church, social cliques, and in the local
community, supported now by custom instead of law, while there is
relatively free competition and participation in the economic and
political systems of our society. 

Confronted with these among other seemingly intractable facts of
history and culture, a growing number of black thinkers have re-
opened with fresh intensity the old debate regarding the purpose and
strategy of education for black Americans. Disillusioned with both
the conceptual depth and the personal meaning of integration as an
intellectual construct and as a social reality, they are giving increasing 
attention to the internal dynamics of the black community— its prob- 
lems, strengths, and potentialities. Black higher education is one of the 
foremost of their concerns, particularly as expressed in the realities 
and the potentialities of the black colleges and of black studies
programs.

Philosophically, these black scholars take the position that the 
black experience is a fundamentally human experience and, as such, 
contains the ingredients to support a broad understanding of the
nature of things. In its most sophisticated expression, this view has
close kinship with the philosophy of Alfred North Whitehead (1948): 

There is no parting from your own shadow. To experience this faith is to 
know that in being ourselves we are more than ourselves: to know that 
our experience, dim and fragmentary as it is, yet sounds the utmost 
depths of reality [p. 20]. 

Within this philosophical perspective, black colleges and black
studies programs are related only in part to the necessities of segrega- 
tion and of urbanization. It is true that black colleges emerged at a 
time when black people were not welcomed in any of the colleges in
the South and in few of the institutions of higher learning in the
North. But these colleges are viewed here as more than mere reactions
to white racism. A major impulse behind their formation is seen as the
desire on the part of black people and their allies to, as W. E. B. 
DuBois put it, develop 

centers of a new and beautiful effort at human education [quoted by 
Drake, 1971, p. 847].

Whether this refocusing of vision will aid the struggle for black
freedom will depend very heavily on just how faithful it is to the
"regimen of fact and logic" in the black community and how sophis-
ticated it is in the use of its awareness that the issues of black educa-
tion in America are not apolitical but are deeply embedded in histori-
cal as well as contemporary group conflicts between white and black
Americans.

Very few Americans have a comprehension of the historical signifi-
cance of black colleges and, therefore, have very little intellectual
leverage on the contemporary problems these institutions face, and
only the scantiest preparation for beginning to understand the signifi-
cance of the movement to develop black studies curricula especially in
the white universities. It is not that this field of experience has suffered
from neglect but, as Professor Bernard Bailyn (1960) observed regard-
ing studies of the early history of American education,

from the opposite, from an excess of writing along certain lines and an
almost undue clarity of direction ... the facts, or at least a great quantity
of them, are there, but they lie inert; they form no significant pattern
[p. 4].

Regrettably, the contributions of these institutions to the growth 
and development of American society are seldom stated. For example, 
it was these institutions that, almost single-handedly, advanced a
totally illiterate people to over 80 percent literacy in the short span of
time between 1865 and 1930. More recently, black colleges awarded 80
percent of all of the undergraduate degrees earned by black people in
this country in 1968. Over the past four years, the black colleges have
graduated an estimated 200,000 students, 80 percent of whom have
entered into professional fields of endeavor. A report by the Carnegie
Commission on Education (De Costa & Bowles, 1971) sums up this 
contribution most cogently: 

with few exceptions, whatever the Negroes have achieved in the way of 
professional entry has been achieved through the Negro Colleges [p. 197].

If one but recalls briefly the tremendous odds against which these
institutions had to contend from their earliest inception up to the 
present day, these and their many other contributions appear to be 
almost miraculous. 

The typologies of Ernst Troeltsch (1960) and Max Weber (1947)
have provided us with the concept that institutional types represent
end products of particular social movements. Originating in some
form of group conflict, and led initially by charismatic personalities,
social movements over time develop the familiar panoply of bureau-
cratic modes of organization and seek, thereby, to stabilize and make
permanent the goals and interests of the original movement. In this
view, black colleges represent the end product of the movement for
the emancipation of black people and their efforts to achieve full
citizenship and human dignity. Similarly, black studies programs can
be viewed as an end product of the urban rebellions of the decade of
the 1960s and the efforts on the part of students who constituted the
"black surge" into the white universities to institutionalize, confirm,
and validate their urban backgrounds as a firm part of the university
setting (see also Kinnison, 1972).

The future of black colleges and the future of black studies pro-
grams will be much brighter, and the education of all Americans
much sounder, if the larger society can begin to look upon these two
centers of black educational experience as legitimate social institu-
tions. In part, this is not done presently because not enough Americans
understand the role of group power in American life. The myth of the
"melting-pot" has obscured for too many of us the critical role of
group power in the adjustment of white immigrant groups in this
country and has therefore left far too many entirely unprepared to 
accept the necessity for such strength to develop among black Ameri- 
cans. Black colleges are important symbols of black aspirations and 
black pride. They function to meet critical psychocultural needs as
well as to provide basic educational opportunities for black young-
sters. Given the proper amount of support and understanding, they
can be powerful allies in the more general search for ways to provide 
a higher education to large numbers of young people from minority 
groups. 

Similarly, it is difficult to overestimate the importance of the move- 
ment for college-based study of the role of black Americans in the 
social, political, cultural, and economic development in the United 
States. Black studies departments represent efforts to institutionalize 
this task. Such programs and departments are fraught with many 
difficulties and many mistakes have been made. Clearly the efforts on 
the part of some people in such programs to turn the black experience
into a mystique that is so unique it requires a black skin to compre- 
hend it, are simply fundamentally wrong. If the black experience is a 
human experience, it can be understood and appreciated by persons
from the full range of backgrounds of the family of man. Just as
clearly, it will not be accepted as a serious intellectual discipline if it
simply becomes the latest accession to the catalogue of urban, ethnic
politics. The effort to institutionalize these programs into various
kinds of study units and departments represents a realistic assessment
of the university as a setting in which it requires organized group
pressure to sustain group needs and group interests over any reason-
able period of time.

Both black colleges and black studies programs represent efforts on
the part of black people to see their history and their culture confirmed
and validated with some degree of permanence through the process of
institutionalization. The eloquent words of Lerone Bennett (1965) are
pertinent to the conclusion of this paper:

Institutions are great social pools in which men see themselves and their 
ideals reflected. They are instruments with which men come to grips 
with the questions: Who am I? Where do I belong? Without meetings,
without rituals, without ceremonies, myths and symbols, men cannot
define themselves or enter into real relations with others. American
Negroes, recognizing this, attempted first to enter institutions formed for
Americans— and were rebuffed. They then embarked on a perilous
journey of self-naming, self-legitimization, and self-discovery [p. 43].

These tasks must continue. 



Questions and Answers 

Q: How do you explain the separatist attitudes which seem to be 
reinforced among many blacks on predominantly white campuses
with established black studies programs resulting from demands of
black students? Do you see this as a positive spinoff or an accompany- 
ing problem to be reckoned with? If the latter is so, how would you 
suggest that one cope with it? 

A: I think Chuck Willie's new book on black students in white col- 
leges speaks very well to this question. It seems that as the number of 
black students on the campus increases, so does the tendency to form 
separatist enclaves. This seems to have something to do with the 
quality of the institutional environment insofar as it appears to black 
students not to be receptive of them as whole persons. They perceive 
the basic environment of the white university as hostile and they with-
draw into the tents of their own group for solace and comfort.

This is a positive spinoff to the extent that it contributes to group
self-reliance. Clearly, it is also a defense mechanism. On the negative
side, it clearly does not enable the youngster to take advantage of
some of the real benefits that would be his or hers were the environ-
ment and the black students able to achieve a better relationship.

Q: What long-term pluralistic model do you have to offer to com- 
pete with the model of universal fraternity and equality? Please 
specify. 

A: Well, I'm not offering a model to compete with universal equal- 
ity, universal fraternity. What I am offering is a pluralistic approach 
to education that makes a genuine equality and fraternity possible. It 
is too easy to enshrine education in mystical ideals about democratic 
values while at the same time continuing a program of cultural domi-
nance based upon a narcissistic preoccupation with European cultural
values. Can one really love a black student when the ecology of one's
emotions and perceptions devalues him culturally? I think not. It's
always easy to love man in the abstract, but it's very difficult to love 
your neighbor. It's easy to love everybody, but it's very hard to love 
that black kid who's in your class. And I suppose I'm specifying a 
model that contends with that. I think I'm saying, insofar as institu- 
tions go, and the effort to achieve equal opportunity or equity for 
minorities in higher education, that there's a role for a variety of insti- 
tutions and a necessity that curricula reflect the pluralism that is 
America. This is a pluralistic society. I think it's time that many of us 
recognize this, and try to develop pluralistic sets of institutions to 
deal with it.